chapter 9 normal distribution 9.1 continuous distribution 9.2 the normal distribution 9.3 a check...
Post on 15-Jan-2016
![Page 1: Chapter 9 Normal Distribution 9.1 Continuous distribution 9.2 The normal distribution 9.3 A check for normality 9.4 Application of the normal distribution](https://reader030.vdocuments.mx/reader030/viewer/2022012901/56649d445503460f94a21173/html5/thumbnails/1.jpg)
Chapter 9 Normal Distribution
9.1 Continuous distribution
9.2 The normal distribution
9.3 A check for normality
9.4 Application of the normal distribution
9.5 Normal approximation to Binomial
9.1 Continuous Distribution
For a discrete distribution, for example the Binomial distribution with n=5 and p=0.4, the probability distribution is

| x | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| f(x) | 0.07776 | 0.2592 | 0.3456 | 0.2304 | 0.0768 | 0.01024 |
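The table values can be reproduced from the binomial formula. The following is a minimal Python sketch; the function name `binomial_pmf` is an illustrative choice, not something from the slides.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# n=5 and p=0.4 are the values used on the slide.
probs = binomial_pmf(5, 0.4)
for x, fx in enumerate(probs):
    print(x, round(fx, 5))
```

The probabilities sum to 1, which is the discrete counterpart of the "area under the curve is 1" property used for continuous distributions.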
A probability histogram
[Figure: probability histogram of the Binomial(n=5, p=0.4) distribution; horizontal axis x from 0 to 5, vertical axis P(x)]
How to describe the distribution of a continuous random variable?
For a continuous random variable, we also represent probabilities by areas: not by areas of rectangles, but by areas under continuous curves.
For continuous random variables, the place of histograms is taken by continuous curves.
Imagine a histogram with narrower and narrower classes. Then we can get a curve by joining the tops of the rectangles. This continuous curve is called a probability density (or probability distribution).
Continuous distributions
• For any x, P(X=x)=0. (For a continuous distribution, the area under a point is 0.)
• We can't use P(X=x) to describe the probability distribution of X.
• Instead, consider P(a≤X≤b).
Density function
A curve f(x):
• f(x) ≥ 0
• The area under the curve is 1
• P(a≤X≤b) is the area under the curve between a and b
[Figure: a density curve y = f(x) for x from 0 to 10]
P(2≤X≤4) = P(2≤X<4) = P(2<X<4)
The endpoints do not matter, because P(X=2) = P(X=4) = 0.
[Figure: a density curve y = f(x) for x from 0 to 10]
9.2 The normal distribution
A normal curve:
• Bell shaped
• Density is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

• μ and σ² are two parameters: the mean and variance of a normal population (σ is the standard deviation)
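As a quick check of the density formula, here is a small Python sketch. The helper name `normal_density` is hypothetical; `statistics.NormalDist` from the standard library is used only as an independent comparison.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_density(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

# Peak of the standard normal curve (mu=0, sigma=1): about 0.399.
print(normal_density(0, 0, 1))

# Peak of a curve with mu=100 and variance 10 (sigma = sqrt(10)): about 0.126.
print(normal_density(100, 100, sqrt(10)))

# The library density should agree with the formula.
print(NormalDist(100, sqrt(10)).pdf(100))
```

Note that the function takes the standard deviation σ, not the variance σ², so a variance of 10 is passed as `sqrt(10)`.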
The normal (bell-shaped) curve: μ=100, σ²=10
[Figure: normal density f(x) for x from 90 to 110]
Normal curves: (μ=0, σ²=1) and (μ=5, σ²=1)
[Figure: two normal density curves f(x) for x from -2 to 8]
Normal curves: (μ=0, σ²=1) and (μ=0, σ²=2)
[Figure: two normal density curves for x from -3 to 3]
Normal curves: (μ=0, σ²=1) and (μ=2, σ²=0.25)
[Figure: two normal density curves f(x) for x from -2 to 8]
The standard normal curve: μ=0 and σ²=1
[Figure: standard normal density curve for x from -3 to 3]
How to calculate the probability of a normal random variable?
• Each normal random variable, X, has a density function, say f(x) (it is a normal curve).
• Probability P(a<X<b) is the area between a and b under the normal curve f(x).
• Table I in the back of the book gives areas for a standard normal curve with μ=0 and σ=1.
• Probabilities for any normal curve (any μ and σ) can be rewritten in terms of a standard normal curve.
Table I: Normal-curve Areas
• Table I is on pages 494-495
• We need it for tests
• It gives areas under the standard normal curve
• The areas are between 0 and z (z>0)
• How to get an area between a and b, when a<b and a, b are positive? area[0,b] - area[0,a]
Get the probability from standard normal table
• Z denotes a standard normal random variable
• The standard normal curve is symmetric about the origin 0
• Draw a graph
Table I: P(0<Z<z)

| z | .00 | .01 | .02 | .03 | .04 | .05 | .06 |
|---|---|---|---|---|---|---|---|
| 0.0 | .0000 | .0040 | .0080 | .0120 | .0160 | .0199 | .0239 |
| 0.1 | .0398 | .0438 | .0478 | .0517 | .0557 | .0596 | .0636 |
| 0.2 | .0793 | .0832 | .0871 | .0910 | .0948 | .0987 | .1026 |
| 0.3 | .1179 | .1217 | .1255 | .1293 | .1331 | .1368 | .1406 |
| 0.4 | .1554 | .1591 | .1628 | .1664 | .1700 | .1736 | .1772 |
| 0.5 | .1915 | .1950 | .1985 | .2019 | .2054 | .2088 | .2123 |
| … | … | … | … | … | … | … | … |
| 1.0 | .3413 | .3438 | .3461 | .3485 | .3508 | .3531 | .3554 |
| 1.1 | .3643 | .3665 | .3686 | .3708 | .3729 | .3749 | .3770 |
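Entries of a table like this can be regenerated in software. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `area_0_to_z` is an illustrative assumption. It relies on the fact that the area between 0 and z equals the cumulative probability at z minus 0.5.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return Z.cdf(z) - 0.5

# Print a few rows in the same layout as the table: columns .00 to .06.
for row in (0.0, 0.1, 1.0):
    print(row, [round(area_0_to_z(row + col / 100), 4) for col in range(7)])
```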
Examples
Example 9.1: P(0<Z<1) = 0.3413
Example 9.2: P(1<Z<2) = P(0<Z<2) - P(0<Z<1) = 0.4772 - 0.3413 = 0.1359
Example 9.3: P(Z≥1) = 0.5 - P(0<Z<1) = 0.5 - 0.3413 = 0.1587
Example 9.4: P(Z≥-1) = 0.3413 + 0.50 = 0.8413
Example 9.5: P(-2<Z<1) = 0.4772 + 0.3413 = 0.8185
Example 9.6: P(Z≤1.87) = 0.5 + P(0<Z≤1.87) = 0.5 + 0.4693 = 0.9693
Example 9.7: P(Z<-1.87) = P(Z>1.87) = 0.5 - 0.4693 = 0.0307
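The table look-ups in Examples 9.1 to 9.7 can be cross-checked with a cumulative distribution function. This is a sketch using Python's standard-library `statistics.NormalDist`; the dictionary labels are only for display.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Each probability is written as a difference of cumulative probabilities.
examples = {
    "9.1 P(0<Z<1)": Z.cdf(1) - Z.cdf(0),
    "9.2 P(1<Z<2)": Z.cdf(2) - Z.cdf(1),
    "9.3 P(Z>=1)": 1 - Z.cdf(1),
    "9.4 P(Z>=-1)": 1 - Z.cdf(-1),
    "9.5 P(-2<Z<1)": Z.cdf(1) - Z.cdf(-2),
    "9.6 P(Z<=1.87)": Z.cdf(1.87),
    "9.7 P(Z<-1.87)": Z.cdf(-1.87),
}
for label, prob in examples.items():
    print(label, round(prob, 4))
```

Results computed this way can differ from the table arithmetic in the last digit, because the table rounds each area to four decimals before the areas are added or subtracted.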
From non-standard normal to standard normal
• X is a normal random variable with mean μ and standard deviation σ
• Set Z=(X-μ)/σ; Z is the standard unit or z-score of X
• Then Z has a standard normal distribution and P(a<X<b) = P((a-μ)/σ < Z < (b-μ)/σ)
Example 9.8
X is a normal random variable with μ=120 and σ=15. Find the probability P(X≤135).
Solution: Let

$$z = \frac{x-120}{15}$$

Then z is normal with

$$\mu_z = \frac{120-120}{15} = 0, \qquad \sigma_z = \frac{15}{15} = 1$$

$$P(x \le 135) = P\left(\frac{x-120}{15} \le \frac{135-120}{15}\right) = P(z \le 1) = 0.5 + 0.3413 = 0.8413$$
Example 9.8 (continued)
X → Z: each value x is replaced by the z-score of x.
P(X≤150): for x=150 the z-score is z=(150-120)/15=2
P(X≤150) = P(Z≤2) = 0.5 + 0.4772 = 0.9772
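The standardization step in Example 9.8 can be sketched in Python as follows. The helper name `z_score` is hypothetical; `statistics.NormalDist` is part of the standard library. The sketch computes each probability twice: once through the z-score and the standard normal curve, and once directly from the normal curve with μ=120 and σ=15.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standard unit of x for a normal population with mean mu and sd sigma."""
    return (x - mu) / sigma

mu, sigma = 120, 15  # values from Example 9.8
for x in (135, 150):
    z = z_score(x, mu, sigma)
    via_z = NormalDist().cdf(z)              # P(Z <= z) on the standard normal
    direct = NormalDist(mu, sigma).cdf(x)    # P(X <= x) on the original scale
    print(x, z, round(via_z, 4), round(direct, 4))
```

Both routes give the same probability, which is the point of rewriting any normal probability in terms of the standard normal curve.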
9.3 Checking Normality
• Most of the statistical tools we will use in this class assume normal distributions.
• In order to know if these are the right tools for a particular job, we need to be able to assess if the data appear to have come from a normal population.
• A normal plot gives a good visual check for normality.
Simulation: 100 observations, normal with mean=5, st dev=1

    x <- rnorm(100, mean=5, sd=1)  # draw 100 values from a normal with mean 5, sd 1
    qqnorm(x)                      # normal plot (Q-Q plot) of the sample

[Figure: normal plot of x; horizontal axis "Quantiles of Standard Normal" from -2 to 2, vertical axis x from 2 to 8]
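For readers without R, the same idea can be sketched in Python with the standard library only. Everything here is an illustrative assumption rather than the slide's method: the helper names `qq_points` and `correlation`, the seed, and the use of the plotting positions (i - 0.5)/n. Instead of drawing the plot, the sketch reports how close the points are to a straight line through their correlation.

```python
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Pair each ordered value with the standard normal quantile at (i - 0.5)/n."""
    n = len(data)
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, sorted(data)))

def correlation(points):
    """Pearson correlation of the (quantile, value) pairs."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
x = [rng.gauss(5, 1) for _ in range(100)]  # 100 observations, mean 5, sd 1
print(round(correlation(qq_points(x)), 3))  # close to 1 for normal data
```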
The plot below shows results on alpha-fetoprotein (AFP) levels in maternal blood for normal and Down's syndrome fetuses.
[Figure: normal plots of AFP levels for the two groups]
Source: H.S. Cuckle, N.J. Wald, S.O. Thompson, "Estimating a woman's risk of having a pregnancy associated with Down's syndrome using her age and serum alpha-fetoprotein level".
Normal Plot
The way these normal plots work is:
• Straight means that the data appear normal
• Parallel means that the groups have similar variances.
Normal plot
In order to plot the data and check for normality, we compare
•our observed data to
•what we would expect from a sample of normal data.
To begin with, imagine taking n=5 random values from a standard normal population (μ=0, σ=1). Let Z(1) ≤ Z(2) ≤ Z(3) ≤ Z(4) ≤ Z(5) be the ordered values. Suppose we do this over and over.

| Sample | Z(1) | Z(2) | Z(3) | Z(4) | Z(5) |
|---|---|---|---|---|---|
| 1 | -1.7 | -0.2 | 0.8 | 1.3 | 1.9 |
| 2 | -0.9 | 0.2 | 0.5 | 0.9 | 2.0 |
| 3 | -2.3 | -1.5 | -0.6 | 0.4 | 1.3 |
| … | … | … | … | … | … |
| Forever | ___ | ___ | ___ | ___ | ___ |
| Mean | -1.163 | -0.495 | 0 | 0.495 | 1.163 |
| | E(Z(1)) | E(Z(2)) | E(Z(3)) | E(Z(4)) | E(Z(5)) |

On average
• the smallest of n=5 standard normal values is 1.163 standard deviations below average
• the second smallest of n=5 standard normal values is 0.495 standard deviations below average
• the middle of n=5 standard normal values is at the average, 0 standard deviations from average
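The "do this over and over" thought experiment can be approximated by simulation. The following Python sketch is a Monte Carlo estimate, so its output only approximates the expected values; the function name `mean_order_statistics`, the number of repetitions, and the seed are arbitrary choices made for illustration.

```python
import random

def mean_order_statistics(n, repetitions, seed=0):
    """Average each ordered value Z(1) <= ... <= Z(n) over many standard normal samples."""
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(repetitions):
        sample = sorted(rng.gauss(0, 1) for _ in range(n))
        for i, value in enumerate(sample):
            totals[i] += value
    return [t / repetitions for t in totals]

# Should land near -1.163, -0.495, 0, 0.495, 1.163 for n=5.
print([round(m, 2) for m in mean_order_statistics(5, 20000)])
```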
The table of “rankits” from the Statistics in Biology table gives these expected values. For larger n, space is saved by just giving the positive values. The negative values are a mirror image of the positive values, since a standard normal distribution is symmetric about its mean of zero.
Check for normality
If X is normal, how do the ordered values of X, X(i), relate to the expected ordered Z values, E(Z(i))?
For a normal distribution with mean μ and standard deviation σ, the expected values of the ordered data, X(i), will be a linear rescaling of the standard normal expected values:
E(X(i)) ≈ μ + σ E(Z(i))
The observed data X(i) will be approximately linearly related to E(Z(i)):
X(i) ≈ μ + σ E(Z(i))
If we plot the ordered X values versus E(Z(i)), we should see roughly a straight line with
•intercept μ
•slope σ
Example: Lifetimes of springs under 900 N/mm² stress

| i | E(Z(i)) | X(i) |
|---|---|---|
| 1 | -1.539 | 153 |
| 2 | -1.001 | 162 |
| 3 | -0.656 | 189 |
| 4 | -0.376 | 216 |
| 5 | -0.123 | 216 |
| 6 | 0.123 | 216 |
| 7 | 0.376 | 225 |
| 8 | 0.656 | 225 |
| 9 | 1.001 | 243 |
| 10 | 1.539 | 306 |
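One way to read a mean and a standard deviation off a normal plot is to fit a straight line to the (E(Z(i)), X(i)) pairs. The sketch below uses ordinary least squares, which is an assumption of this example (the slides judge the line by eye); `fit_line` is a hypothetical helper name.

```python
def fit_line(z, x):
    """Least-squares intercept and slope for x plotted against z."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    sxz = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sxz / szz
    intercept = x_bar - slope * z_bar
    return intercept, slope

# Spring lifetimes at 900 N/mm^2 stress and their expected normal scores.
expected_z = [-1.539, -1.001, -0.656, -0.376, -0.123, 0.123, 0.376, 0.656, 1.001, 1.539]
lifetimes = [153, 162, 189, 216, 216, 216, 225, 225, 243, 306]

intercept, slope = fit_line(expected_z, lifetimes)
print(round(intercept, 1), round(slope, 1))
```

Under this fit the intercept is the sample mean of the lifetimes (about 215) and the slope (about 43) serves as a rough estimate of the standard deviation.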
[Figure: normal plot "Lifetime of Springs at Stress 900"; horizontal axis E(Z) from -2 to 2, vertical axis Lifetime from 100 to 350; series: 900 stress]
The plot is fairly linear, indicating that the data are pretty similar to what we would expect from normal data.
To compare results from different treatments, we can put more than one normal plot on the same graph.
[Figure: normal plots of spring lifetimes; horizontal axis E(Z) from -2 to 2, vertical axis Lifetime from 100 to 350; series: 950 stress and 900 stress]
The intercept for the 900 stress level is above the intercept for the 950 stress group, indicating that the mean lifetime of the 900 stress group is greater than the mean of the 950 stress group.
The slopes are similar, indicating that the variances or standard deviations are similar.
These plots were done in Excel. In Excel you can either enter values from the table of E(Z) values or generate approximations to these table values.
One way to generate approximate E(Z) values is to generate evenly spaced percentiles of a standard normal, Z, distribution.
The ordered X values correspond roughly to particular percentiles of a normal distribution.
For example, if we had n=5 values, the 3rd ordered value would be roughly the median or 50th percentile.
A common method is to use percentiles corresponding to 100(i - 0.5)/n.
For n=5 this would give us

| i | (i - 0.5)/n | |
|---|---|---|
| 1 | 0.1 | |
| 2 | 0.3 | |
| 3 | 0.5 | the 50th percentile |
| 4 | 0.7 | |
| 5 | 0.9 | |

For E(Z) we would use corresponding percentiles of a standard normal Z distribution.
Percentiles expressed as fractions are called quantiles. The 0.5 quantile is the 50th percentile.
Normal plots from this perspective are sometimes called Q-Q plots, since we are plotting standard normal quantiles versus the associated quantiles of the observed data.
For n = 10 values for the spring data, the corresponding normal percentiles would be

| i | (i - 0.5)/n | Z quantile |
|---|---|---|
| 1 | 0.05 | -1.64 |
| 2 | 0.15 | -1.04 |
| 3 | 0.25 | -0.67 |
| 4 | 0.35 | -0.39 |
| 5 | 0.45 | -0.13 |
| 6 | 0.55 | 0.13 |
| 7 | 0.65 | 0.39 |
| 8 | 0.75 | 0.67 |
| 9 | 0.85 | 1.04 |
| 10 | 0.95 | 1.64 |
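The quantile column can be generated rather than looked up. Here is a short Python sketch using the standard library's `statistics.NormalDist.inv_cdf`; the function name `normal_plot_quantiles` is an illustrative assumption.

```python
from statistics import NormalDist

def normal_plot_quantiles(n):
    """Standard normal quantiles at the fractions (i - 0.5)/n for i = 1..n."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# n = 10, as for the spring data.
for i, q in enumerate(normal_plot_quantiles(10), start=1):
    print(i, (i - 0.5) / 10, round(q, 2))
```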
For assessing whether a plotted line is fairly parallel, either the E(Z) values or the normal quantiles work fine.
If you are doing the plot by hand it’s easiest to use the E(Z) table.
If you are doing these in Excel it’s easiest to use the normal quantiles.
The function NORMINV(p,0,1) finds the Z value corresponding to a given quantile. This is the inverse of the function that finds the cumulative probability for a given Z value.

| From | Function | To | Example | Result |
|---|---|---|---|---|
| Z | NORMDIST | probability | = NORMDIST(1.645, 0, 1, TRUE) | 0.95 |
| Probability | NORMINV | Z | = NORMINV(0.95, 0, 1) | 1.645 |
(The TRUE in NORMDIST says to return the cumulative probability rather than density curve height.)
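Outside Excel, the same pair of inverse functions is available in many tools. As a rough Python counterpart (not an exact emulation of the Excel functions), the standard library's `statistics.NormalDist` offers `cdf`, `inv_cdf`, and `pdf`:

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # mean 0, standard deviation 1

probability = Z.cdf(1.645)   # cumulative probability, like NORMDIST(1.645, 0, 1, TRUE)
z_value = Z.inv_cdf(0.95)    # quantile, like NORMINV(0.95, 0, 1)
height = Z.pdf(1.645)        # density curve height, like NORMDIST(1.645, 0, 1, FALSE)

print(round(probability, 2), round(z_value, 3), round(height, 3))
```

Applying `inv_cdf` to the result of `cdf` returns the original Z value, which is what "inverse" means here.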
n = 10

| i | (i-0.5)/n | Normal Quantile | E(Z) | Ordered 900 stress | Ordered 950 stress |
|---|---|---|---|---|---|
| 1 | 0.05 | -1.645 | -1.539 | 153 | 117 |
| 2 | 0.15 | -1.036 | -1.001 | 162 | 135 |
| 3 | 0.25 | -0.674 | -0.656 | 189 | 135 |
| 4 | 0.35 | -0.385 | -0.376 | 216 | 162 |
| 5 | 0.45 | -0.126 | -0.123 | 216 | 162 |
| 6 | 0.55 | 0.126 | 0.123 | 216 | 171 |
| 7 | 0.65 | 0.385 | 0.376 | 225 | 189 |
| 8 | 0.75 | 0.674 | 0.656 | 225 | 189 |
| 9 | 0.85 | 1.036 | 1.001 | 243 | 198 |
| 10 | 0.95 | 1.645 | 1.539 | 306 | 225 |
Excel File of Lifetime of Springs Data
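To compare the two stress groups numerically, a line can be fitted to each normal plot. The sketch below is one possible approach, not the Excel workflow from the slides: it uses the normal quantiles at (i - 0.5)/n on the horizontal axis and ordinary least squares for the fit, and the helper name `normal_plot_line` is hypothetical.

```python
from statistics import NormalDist, mean

def normal_plot_line(values):
    """Least-squares (intercept, slope) of ordered values against normal quantiles."""
    n = len(values)
    q = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    x = sorted(values)
    q_bar, x_bar = mean(q), mean(x)
    sqx = sum((qi - q_bar) * (xi - x_bar) for qi, xi in zip(q, x))
    sqq = sum((qi - q_bar) ** 2 for qi in q)
    slope = sqx / sqq
    return x_bar - slope * q_bar, slope

stress_900 = [153, 162, 189, 216, 216, 216, 225, 225, 243, 306]
stress_950 = [117, 135, 135, 162, 162, 171, 189, 189, 198, 225]

for label, data in (("900 stress", stress_900), ("950 stress", stress_950)):
    intercept, slope = normal_plot_line(data)
    print(label, round(intercept, 1), round(slope, 1))
```

A higher intercept points to a larger mean lifetime, and slopes of similar size point to similar standard deviations.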
For data that are not normal
Many types of data tend to follow a normal distribution, but many data sets aren't particularly normal. If the data aren't fairly normal we have several options.
Transform the data, meaning change the scale.
• A log or ln scale is most common.
  • Weights of fish
  • Concentrations
  • Bilirubin levels in blood
  • pH is a log scale
  • RNA expression levels in a microarray experiment
• A reciprocal (1/Y) changes times to rates
• Other powers
  • Square root for Poisson variables
Non-normal data continued
Use a distribution other than the normal distribution
• Weibull distribution for lifetimes
  • Motors at General Electric
  • Patients in a clinical trial
Weibull Distributions (Time to Failure: Non-binomial & Non-normal)
Infant Mortality: Fail immediately or last a long time
Early Failure: These do not fail immediately, but many do fail early
Old-age Wearout: Very few of these fail until they wear out
Non-normal data continued
Use a nonparametric method, which doesn't assume any distribution
Finding a distribution that models the data well, rather than using a nonparametric method,
• Allows us to develop a more complete model
• Allows us to generalize to other situations
• Gives us more precise information for the same amount of effort
The methods in this class largely apply to normal data or data that we can transform to normal.
The EPA fish example is a good example of transforming data with a log transformation.
Geometric means and harmonic means arise when we are working with transformed data.
For example fish weights are usually analyzed in the log scale. Having a mean in the log scale we want to put this value back into the original scale, for example grams.
The back-transformed mean from the log scale is the geometric mean.
The back-transformed mean from a reciprocal scale (rates), is the harmonic mean.
Back-transformed differences between means in the log scale correspond to ratios of geometric means in the original scale.
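The back-transformation idea can be sketched in a few lines of Python. The data values below are hypothetical and only serve as an illustration; the helper names are also assumptions. Python's standard library has its own `statistics.geometric_mean` and `statistics.harmonic_mean`, used here only as a cross-check.

```python
from math import exp, log
from statistics import mean
import statistics

def geometric_mean(values):
    """Mean taken in the ln scale, then back-transformed with exp."""
    return exp(mean([log(v) for v in values]))

def harmonic_mean(values):
    """Mean taken in the reciprocal scale, then back-transformed with 1/x."""
    return 1 / mean([1 / v for v in values])

weights = [120.0, 250.0, 90.0, 400.0]  # hypothetical fish weights in grams

print(mean(weights), geometric_mean(weights), harmonic_mean(weights))
print(statistics.geometric_mean(weights), statistics.harmonic_mean(weights))
```

For positive data that are not all equal, the harmonic mean is smaller than the geometric mean, which is smaller than the ordinary mean.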
Suppose ln(X) = Y ~ N(μ, σ²).
This means Y (or ln(X)) is distributed as normal with mean μ and variance σ².
The geometric mean is e^μ, the back-transformed population mean in the ln scale.
If we have the difference between two means in the ln scale, then back-transforming gives us

$$e^{\mu_1 - \mu_2} = \frac{e^{\mu_1}}{e^{\mu_2}} = \text{ratio of geometric means.}$$
About geometric means
A fact is that if the variances of both populations are the same in the ln scale, then the ratio of the population geometric means is the same as the ratio of the population means.
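A short derivation of this fact, under the assumption that each population is lognormal, that is ln(X_j) ~ N(μ_j, σ_j²) for j = 1, 2. It uses the standard result that the mean of such a population is E(X_j) = e^{μ_j + σ_j²/2}:

$$\frac{E(X_1)}{E(X_2)} = \frac{e^{\mu_1 + \sigma_1^2/2}}{e^{\mu_2 + \sigma_2^2/2}} = e^{\mu_1 - \mu_2}\, e^{(\sigma_1^2 - \sigma_2^2)/2} = \frac{e^{\mu_1}}{e^{\mu_2}} \quad \text{when } \sigma_1^2 = \sigma_2^2.$$

The last expression is the ratio of the geometric means, so the two ratios agree exactly when the ln-scale variances are equal.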
Question: Why not just use the means in the original scale?
Answer: Means are best when populations are normal. Using the ratio of the geometric means will give us a more precise estimate of the true ratio than using the ratio of the means in the original scale.
A similar fact explains why we use means rather than medians.
For a normal population the mean is the same as the median. We could use either the sample mean or the sample median to estimate μ.
BUT, the mean will be a more precise guess at (estimate of) the true value, μ.
It would take us roughly 50% more values (larger n) using the median as our guess at μ to accomplish the same degree of precision as we get using the mean as our guess at μ.
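The precision claim can be explored with a small simulation. This Python sketch is a Monte Carlo illustration, not a proof; the function name, sample size n=25, number of repetitions, and seed are arbitrary choices. It compares how much the sample mean and the sample median vary across repeated normal samples.

```python
import random
from statistics import median, pvariance

def compare_mean_and_median(n, repetitions, seed=0):
    """Return (variance of sample means, variance of sample medians) for normal samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(repetitions):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        means.append(sum(sample) / n)
        medians.append(median(sample))
    return pvariance(means), pvariance(medians)

var_mean, var_median = compare_mean_and_median(n=25, repetitions=4000)
# A ratio near 1.5 means the median needs roughly 50% more values for the same precision.
print(round(var_median / var_mean, 2))
```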
9.4 Application of the normal distribution
1960-62 Public Health Service Health Examination Survey
• 6,672 Americans 18-79 years old
• The women's heights were approximately normal with mean 63 inches and standard deviation 2.5 inches.
• What percentage of women were over 68 inches tall?
Solution:
X = height
P(X>68) = P(Z>(68-63)/2.5) = P(Z>2) = 0.5-0.4772 = 0.0228, so about 2.3% of the women.
Continuity Correction for a Better Approximation
Sometimes only integer values are possible for x.
• x = score on the LSAT
• x = # of heads in 10 tosses of a fair coin
A normal approximation is more accurate with a "continuity correction".
1976 LSAT
• Approximately normal with mean 650, st. dev. 60
• P(X≥680) ≈ P(Z>(679.5-650)/60) = P(Z>0.49) = 0.5-0.1879 = 0.3121
9.5 Normal Approximation to Binomial
A binomial distribution: n=10, p=0.5
• μ=np=5
• σ²=np(1-p)=2.5
• σ=1.58
1. P(X≥7) = 0.172 from the Binomial
2. P(X≥7) ≈ P(Z>(6.5-5)/1.58) = P(Z>0.95) = 0.5-0.3289 = 0.1711 from the normal approximation
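The two answers can be compared directly in code. The following Python sketch computes the exact binomial tail and the normal approximation with the continuity correction; the function names are illustrative assumptions, and the result of the approximation differs slightly from the slide because the slide rounds z to two decimals before using the table.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_upper_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

def normal_upper_tail(n, p, k):
    """Normal approximation to P(X >= k), using k - 0.5 as the continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(k - 0.5)

print(round(binomial_upper_tail(10, 0.5, 7), 4))  # exact binomial probability
print(round(normal_upper_tail(10, 0.5, 7), 4))    # normal approximation
```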
Dots: Binomial Probabilities
Smooth Line: Normal Curve With Same Mean and Variance
[Figure: binomial probabilities (dots) and the matching normal curve (smooth line); horizontal axis x from 0 to 10, vertical axis f(x)]
Normal Approximation Is Good If
• The normal curve has the same mean and standard deviation as the binomial
• np>5 and n(1-p)>5
• Continuity correction is made
Example
Records show that 60% of the customers of a service station pay with a credit card. Use the normal approximation to find the probabilities that among 100 customers
1. At most 65 will pay with a credit card
2. At least 55 will pay with a credit card
3. Between 55 and 65 will pay with a credit card
4. Exactly 65 will pay with a credit card
Solution:
X = # of customers who pay with a credit card
μ=np=60, σ²=np(1-p)=24, σ=4.8990
1.

$$P(X \le 65) \approx P\left(Z \le \frac{65.5-60}{4.899}\right) = P(Z \le 1.12) = 0.5 + 0.3686 = 0.8686$$

2.

$$P(X \ge 55) \approx P\left(Z \ge \frac{54.5-60}{4.899}\right) = P(Z \ge -1.12) = P(Z \le 1.12) = 0.5 + 0.3686 = 0.8686$$
Normal Approximation
3.

$$P(55 \le X \le 65) \approx P\left(\frac{54.5-60}{4.899} \le Z \le \frac{65.5-60}{4.899}\right) = P(-1.12 \le Z \le 1.12) = 2(0.3686) = 0.7372$$

4.

$$P(X = 65) = P(65 \le X \le 65) \approx P\left(\frac{64.5-60}{4.899} \le Z \le \frac{65.5-60}{4.899}\right) = P(0.92 \le Z \le 1.12) = 0.3686 - 0.3212 = 0.0474$$
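All four parts follow the same pattern: widen the integer range by 0.5 on each side, then use the normal curve with μ=60 and σ=√24. The Python sketch below applies that pattern and also computes the exact binomial probabilities for comparison; the helper names `approx_between` and `exact_between` are hypothetical. Results differ slightly from the slide values because the slide rounds z to two decimals before using the table.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.6
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))  # mean 60, sd about 4.899

def approx_between(lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

def exact_between(lo, hi):
    """Exact binomial P(lo <= X <= hi)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(lo, hi + 1))

# "At most 65" is treated as 0..65 and "at least 55" as 55..100; the far
# tails contribute essentially nothing to the normal approximation.
cases = {
    "1. X <= 65": (0, 65),
    "2. X >= 55": (55, 100),
    "3. 55 <= X <= 65": (55, 65),
    "4. X = 65": (65, 65),
}
for label, (lo, hi) in cases.items():
    print(label, round(approx_between(lo, hi), 4), round(exact_between(lo, hi), 4))
```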