Normal Distribution Introduction

Upload: patrick-dawson

Post on 28-Dec-2015


Normal Distribution

Introduction

Compare to Discrete Variables

• No. of Doctor’s Visits During the Year

No. of Visits    No. of Patients    P
0                400                0.14
1                950                0.34
2                850                0.30
3+               600                0.21
Total            2800               0.99
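The P column is each count divided by the total number of patients. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not part of the slides.

```python
# Counts from the table: number of visits -> number of patients
patients_by_visits = {"0": 400, "1": 950, "2": 850, "3+": 600}

total_patients = sum(patients_by_visits.values())

# Probability of each outcome = its count / total count
probabilities = {
    visits: count / total_patients
    for visits, count in patients_by_visits.items()
}

for visits, p in probabilities.items():
    print(f"{visits:>2} visits: P = {p:.2f}")

# The unrounded probabilities sum to 1. The table's 0.99 comes from
# rounding each entry to two decimals before adding.
exact_total = sum(probabilities.values())
rounded_total = sum(round(p, 2) for p in probabilities.values())
print(f"exact total = {exact_total:.2f}, rounded total = {rounded_total:.2f}")
```

This also explains why the P column adds to 0.99 rather than 1: it is a rounding effect, not missing patients.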

Histograms

• The height of each bar represents the probability of that event

• If each bar is one unit in width, then the area also equals the probability

• The total area under all the bars has to add to 1.

[Figure: Doctor's Visits. Bar chart with one bar per No. of Visits (0, 1, 2, 3+); vertical axis labeled Probability, with ticks from 0 to 1000.]

Continuous Variables

Patient’s Weight    Frequency
300                 2
290                 3
280                 7

But… Can take on any value

• Can make the weight intervals as small as we want: every 10 lbs or 5 or 1, or … 0.5, 0.1, 0.001

• Histogram: As the intervals get smaller, the bars decrease in width
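To see what shrinking the intervals does, here is a rough Python sketch. The weights are simulated (the mean of 180 lbs and spread of 30 lbs are made-up assumptions, not data from the slides), and `histogram_summary` is a hypothetical helper. Bar heights are scaled so that height times width equals the share of cases in each bar.

```python
import random

random.seed(0)
# Hypothetical patient weights in lbs (simulated, not from the slides)
weights = [random.gauss(180, 30) for _ in range(10_000)]

def histogram_summary(values, bin_width):
    """Return (number of bars, total area) for a density-scaled histogram."""
    counts = {}
    for v in values:
        bin_index = int(v // bin_width)  # which interval the value falls in
        counts[bin_index] = counts.get(bin_index, 0) + 1
    n = len(values)
    # height = (share of cases in the bar) / width, so area = height * width
    area = sum((c / n / bin_width) * bin_width for c in counts.values())
    return len(counts), area

summaries = {w: histogram_summary(weights, w) for w in (10, 5, 1, 0.5, 0.1)}
for w, (bars, area) in summaries.items():
    print(f"width {w:>4} lbs: {bars:>5} bars, total area = {area:.6f}")
```

Narrower intervals give more, thinner bars, while the total area stays at 1.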

Line Graph

• Completely continuous, no width at all. Just connected points

[Figure: Line Graph, with axis ticks from 0 to 100.]

Infinitesimally Small Intervals

• Then really just points on a smooth curve.

• We can also have n, the number of cases, increase to infinity.

• The total probability is still one.

Infinitesimally Small Intervals

Smoother Curve

Area under the curve = 1.

Probabilities

• Can no longer read the probability of a single event.

• In a continuous distribution, can only measure the probability of a value falling within some range

Probability Within a Range

Probability of a value falling within the range is equal to the area under the curve.

Bad News

• To calculate the area under the curve we would need to use calculus

• But not so bad news, others have done the calculations and set up tables for us

• Applause

Diversity of Continuous Distributions

• Lots of different distributions

• Lots of different shaped curves

• Would need lots of different tables, however….

The Most Important Distribution

Introducing the Normal Distribution

“Bell-Shaped Curve”

What are its characteristics?

Normal Distribution

• First described by Abraham de Moivre in 1733.

• A lot of the relevant math done by Carl Gauss, therefore “Gaussian Curve”

Properties

• Symmetrical about the mean

• Mean, Median & Mode are all equal

• Asymptotic, height never reaches zero.

• What’s the total area under the curve?
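The area question can be checked numerically without tables. The sketch below approximates the area under the bell curve with the trapezoid rule. The function names are illustrative, and cutting the tails off at 10 standard deviations is an assumption that is harmless here because the curve is practically zero that far out.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def area_under_curve(f, a, b, steps=100_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

total_area = area_under_curve(normal_pdf, -10, 10)
upper_half = area_under_curve(normal_pdf, 0, 10)
print(f"total area          = {total_area:.4f}")
print(f"area above the mean = {upper_half:.4f}")
```

Because the curve is symmetrical about the mean, the area on each side is half of the total.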

Ranges & Probabilities

• 50% of all values fall above the mean & 50% below it.

• All probabilities depend on how far the values lie from the mean

• Distance measured in number of standard deviations from the mean

Probabilities related to S.D.

One S.D. on either side of the mean

Area =

Other Distances

• 1 S.D. on either side of the mean includes 68% of the cases

• 2 S.D. on either side of the mean includes 95% of the cases

• 3 S.D. on either side of the mean includes 99.7% of the cases
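These three percentages can be reproduced with the error function in Python's standard library. This is a small sketch; `prob_within` is a hypothetical helper name, and it relies on the standard identity that the area within k standard deviations of the mean equals erf(k / sqrt(2)).

```python
import math

def prob_within(k):
    """P(mean - k*SD < X < mean + k*SD) for any normal distribution."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} S.D. of the mean: {prob_within(k):.3f}")
```

The printed values round to the 68%, 95% and 99.7% figures in the list.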

Many Different Normal Distributions

• Determined by their mean and standard deviation

Mean gives location. Standard Deviation gives shape – more or less dispersed.

Proportions remain Same

• Relationships between probability and standard deviation are the same in all Normal Distributions

• However in order to use the tables provided, we have to convert to the “Standard Normal Distribution”

The Standard Normal Distribution

Mean = 0. Standard Deviation = 1.

Z-values

• Converts values in any normal distribution to the standard normal distribution.

• It’s a way to express the distance from the mean in units of S.D.

• Z = (X – X̄) / S.D.

• Compare this to 18 eggs. How many dozen?
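The Z conversion is a single division, just like converting eggs to dozens. A minimal sketch follows; the function name and the sample numbers (mean 150, S.D. 10) are chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mean) / sd

# Same idea as converting eggs to dozens: divide by the size of one unit.
eggs_in_dozens = 18 / 12
print(eggs_in_dozens)            # 18 eggs expressed in dozens

# Hypothetical values: mean = 150, S.D. = 10
z_above = z_score(170, mean=150, sd=10)
z_below = z_score(135, mean=150, sd=10)
print(z_above)                   # positive: above the mean
print(z_below)                   # negative: below the mean
```

A negative Z simply means the value lies below the mean.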

From Z find Probabilities

• Use Table A-3. Gives areas in the upper tail of the S.N.D.

• What is the area above Z = 1.28?

• Go to the Table. Go to 1.2 in the left-hand column & across to 0.08.

• A = 0.10. The probability that a value will fall above Z = 1.28 is 10%.
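The table lookup can be cross-checked in code. The sketch below computes the upper-tail area of the standard normal distribution with the complementary error function; `upper_tail_area` is a hypothetical name, and this stands in for Table A-3 rather than reproducing it.

```python
import math

def upper_tail_area(z):
    """Area under the standard normal curve above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

area = upper_tail_area(1.28)
print(f"area above Z = 1.28: {area:.3f}")
```

Rounded to the precision of the table, this agrees with the 10% read from the 1.2 row and 0.08 column.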

S.N.D.

mean = 0. S.D. = 1

Test It

• Let’s look up the ones we already know.

• Range = 1 S.D. on either side of the mean

• Z = 1. Find 1.0 in the left-hand column

• Go across to 0.00

• Reads 0.159. So area in the tail is 0.159.

• What’s the area between Z = 1 and the mean?

A = .159

If Area above z = 1 is 0.159, what is the area between Z and the mean?

A = 0.500 - 0.159 = 0.341

We need to add an equal area on the other side of the mean.

Total shaded area = 0.682

Always draw the N.D.

You Try It

• What is the probability that a value will fall within 2 s.d. of the mean?

• Draw the N.D.

• Look up area that corresponds to Z = 2.

• A = 0.023

• Find the area between mean & Z = 2.

• 0.500 – 0.023 = 0.477

• Double it. A = 0.954

Try the Reverse

• I want to find the value above which 10% of the population falls.

• This time, area = 0.100

• Look in body of table for 0.100

• Read across and up. Z = 1.28

• Would have to use the formula for Z in reverse in order to get the value for X
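Reading the table in reverse corresponds to an inverse lookup. As a sketch, Python's `statistics.NormalDist` (Python 3.8 or later) provides `inv_cdf`, which takes the area below a point, so an upper-tail area has to be subtracted from 1 first. The helper name is an assumption for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def z_for_upper_tail(area):
    """Z-value that has the given area above it in the S.N.D."""
    # inv_cdf expects the area BELOW the point, hence 1 - area
    return standard_normal.inv_cdf(1 - area)

z = z_for_upper_tail(0.100)
print(f"Z = {z:.2f}")
```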

Finding X

Z = (X – X̄) / S.D.

1.28 = (X – X̄) / S.D.

S.D. * 1.28 + X̄ = X

To convert to X, have to know mean & S.D.

Example

• Weights of 40-yr old women are normally distributed with a mean of 150 and an S.D. of 10.

• What is the value above which the highest 10% of weights falls?

• X = 1.28 * 10 + 150 = 162.8
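As a check on this example, the sketch below applies the rearranged formula X = S.D. * Z + mean with the table value Z = 1.28, and then compares it with a direct inverse lookup using `statistics.NormalDist`. Variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 150, 10   # mean and S.D. of the weights in the example
z = 1.28             # table value for an upper-tail area of 0.100

# Formula for Z used in reverse: X = S.D. * Z + mean
x_from_table = sd * z + mean

# Direct inverse lookup: the value with 90% of weights below it
x_direct = NormalDist(mu=mean, sigma=sd).inv_cdf(0.90)

print(f"from the table value: {x_from_table:.1f}")
print(f"direct inverse:       {x_direct:.1f}")
```

Both routes give a cutoff of about 163, that is, 1.28 standard deviations above the mean of 150.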

Application

• Studying a progressive neurological disorder. At autopsy, we weigh the brains. Find the wts are normally distributed with a mean of 1100 grams and an S.D. of 100 g.

• Find the probability that one of the brains weighs less than 800 g.

Draw the N.D.

Z = (800 – 1100)/100 = -3

P(X<800) = Area = 0.0013

[Figure: N.D. with 800 and 1100 marked on the horizontal axis.]
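The brain-weight calculation can be verified with a short sketch using `statistics.NormalDist`. It uses the 800 g cutoff from the worked Z calculation; the variable names are illustrative.

```python
from statistics import NormalDist

# Mean and S.D. in grams, as given in the application
brain_weights = NormalDist(mu=1100, sigma=100)

x = 800
z = (x - brain_weights.mean) / brain_weights.stdev
p_below = brain_weights.cdf(x)   # area to the left of x

print(f"Z = {z:.1f}")
print(f"P(X < {x}) = {p_below:.4f}")
```

The area below Z = -3 is roughly 0.13%, so a brain this light would be a rare observation under these assumptions.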

The End

For Now

More Ranges

• The cholesterol levels for a certain population are approximately normally distributed with a mean of 200 mg/100 ml & an S.D. of 20 mg/100 ml.

• Find the probabilities for an individual picked at random to have cholesterol levels in the following ranges

Mean = 200 mg/100 ml

S.D. = 20 mg/100 ml

A. Between 180 & 200

B. Greater than 225

C. Between 190 & 210

A. Between 180 & 200

• Z1 = 0. Z2 = (180 – 200)/20 = -1

• So the area is from the mean to one S.D. below it. If it were both sides, it would be 0.68. Since it is only one side, the area is half of that: 0.34. P = 0.34.

B. Greater than 225

• Z = (225 – 200)/20 = 1.25

• Look it up. Area = 0.106

• P(X>225) = 0.106

C. Between 190 & 210

• Z1 = (190 – 200)/20 = -0.5. Look up Z = 0.5: area = 0.309. But that is the tail. What is the area from Z = -0.5 to the mean? 0.500 – 0.309 = 0.191

• Z2 = (210 – 200)/20 = 0.5. Symmetrical. So Z2 to the mean is also 0.191.

• P = 2 times 0.191 = 0.382
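The three cholesterol ranges can be cross-checked with one short sketch using `statistics.NormalDist`. Each range probability is the area below the upper limit minus the area below the lower limit. Variable names are illustrative.

```python
from statistics import NormalDist

# Mean and S.D. in mg/100 ml, as given for this population
cholesterol = NormalDist(mu=200, sigma=20)

# A. Between 180 & 200: from one S.D. below the mean up to the mean
p_a = cholesterol.cdf(200) - cholesterol.cdf(180)

# B. Greater than 225: upper tail, so 1 minus the area below 225
p_b = 1 - cholesterol.cdf(225)

# C. Between 190 & 210: half an S.D. on either side of the mean
p_c = cholesterol.cdf(210) - cholesterol.cdf(190)

print(f"A. P(180 < X < 200) = {p_a:.3f}")
print(f"B. P(X > 225)       = {p_b:.3f}")
print(f"C. P(190 < X < 210) = {p_c:.3f}")
```

Results computed this way can differ from table-based answers in the third decimal place, because the table areas are rounded before being subtracted and doubled.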