Lecture 6 Bell Shaped Curves

Upload: caren-gibbs

Post on 17-Dec-2015

TRANSCRIPT

  • Slide 1
  • Lecture 6 Bell Shaped Curves
  • Slide 2
  • Thought Question 1: The heights of adult women in the United States follow, at least approximately, a bell-shaped curve. What do you think this means?
  • Slide 3
  • Thought Question 2: What does it mean to say that a man's weight is in the 30th percentile for all adult males?
  • Slide 4
  • Thought Question 3: A standardized score is simply the number of standard deviations an individual falls above or below the mean for the whole group. Male heights have a mean of 70 inches and a standard deviation of 3 inches. Female heights have a mean of 65 inches and a standard deviation of 2 inches. Thus, a man who is 73 inches tall has a standardized score of 1. What is the standardized score corresponding to your own height?
  • Slide 5
  • Thought Question 4: Data sets consisting of physical measurements (heights, weights, lengths of bones, and so on) for adults of the same species and sex tend to follow a similar pattern. The pattern is that most individuals are clumped around the average, with numbers decreasing the farther values are from the average in either direction. Describe what shape a histogram of such measurements would have.
  • Slide 6
  • 8.1 Populations, Frequency Curves, and Proportions: Move from pictures and shapes of a set of data to pictures and shapes for populations of measurements.
  • Slide 7
  • Frequency Curves: A frequency curve is a smoothed-out histogram, made by connecting the tops of the rectangles with a smooth curve. Example: the frequency curve for the population of British male heights. The measurements follow a normal distribution (also called a bell-shaped or Gaussian curve). Note: The height of the curve is set so that the area under the entire curve is 1.
  • Slide 8
  • Frequency Curves: Not all frequency curves are bell-shaped! Example: the frequency curve for the population of dollar amounts of car insurance damage claims. The measurements follow a right-skewed distribution. The majority of claims were below $5,000, but there were occasionally a few extremely high claims.
  • Slide 9
  • Proportions: Recall that the total area under a frequency curve = 1, representing 100% of the population. The mean British male height is 68.25 inches. The area to the right of the mean is 0.50, so about half of all British men are 68.25 inches or taller. Key: The proportion of a population of measurements falling in a certain range = the area under the curve over that range. Tables will provide other areas under normal curves.
  • Slide 10
  • 8.2 The Pervasiveness of Normal Curves: Many populations of measurements follow approximately a normal curve: physical measurements within a homogeneous population (e.g., heights of male adults); standard academic tests given to a large group (e.g., SAT scores).
  • Slide 11
  • Normal Distribution Probability: Probability is the area under the curve!
  • Slide 12
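  • Normal Distribution: The height of a normal density curve at any point $x$ is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$, where $\mu$ is the mean and $\sigma$ is the standard deviation.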
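As a minimal sketch of the density formula on this slide (the function name `normal_density` is an illustrative choice, not something from the lecture), the height of the curve can be computed directly from the mean and standard deviation:

```python
import math

def normal_density(x, mean, sd):
    """Height of the normal curve at x for the given mean and standard deviation."""
    z = (x - mean) / sd  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# The curve is highest at the mean and symmetric around it.
peak = normal_density(0.0, 0.0, 1.0)
left = normal_density(-1.0, 0.0, 1.0)
right = normal_density(1.0, 0.0, 1.0)
print(f"height at mean: {peak:.4f}, one s.d. either side: {left:.4f}, {right:.4f}")
```

The two heights one standard deviation either side of the mean come out equal, which is the symmetry the later slides rely on.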
  • Slide 13
  • Importance of Normal Distribution: 1. Describes many random processes or continuous phenomena. 2. Basis for classical statistical inference.
  • Slide 14
  • Examples with approximate Normal distributions: height; weight; IQ scores; standardized test scores; body temperature; repeated measurement of the same quantity. These distributions, which are like generalised relative frequency histograms, can take many different shapes, some symmetrical, some skewed. There is one shape, however, that crops up all through the natural world, and that is the Normal distribution.
  • Slide 15
  • The Normal Distribution is symmetric. There are many different Normal curves: some are fat, some are thin; some are centred at 0, some at 1, some at 5, and so on. Each Normal curve can be uniquely identified by two parameters: the mean and the standard deviation. Once you know the mean and the standard deviation for a Normal curve, it is possible to draw the curve. Normal curves are centred at the mean, and the standard deviation describes how spread out they are.
  • Slide 16
  • A Normal Frequency Curve for the Population of SAT scores
  • Slide 17
  • The area under a Normal curve to the left of the mean is 0.5. This indicates that the probability that something which is normally distributed is less than its mean is 0.5. The area under the curve to the left of any point A on the X axis represents the probability that a Normal variable is less than A.
  • Slide 18
  • Slide 19
  • 8.3 Percentiles and Standardized Scores: Your percentile = the percentage of the population that falls below you. Finding percentiles for normal curves requires: your own value; the mean for the population of values; the standard deviation for the population. Then any bell curve can be standardized so that one table can be used to find percentiles.
  • Slide 20
  • Percentiles Example: Have you ever wondered what percentage of the population (of your gender) is taller than you are? Your percentile in a population represents the position of your measurement in comparison with everyone else's. It gives the percentage of the population that falls below you. For example, if you are in the 98th percentile, it means that 98% of the population falls below you and only 2% is above you. Your percentile value is easy to find if the population of values has an approximate bell shape. Although there are an unlimited number of potential bell-shaped curves, each one can be completely determined once you know the mean and standard deviation of the population. In addition, each curve can be standardized in such a way that the same table can be used to find percentiles for any of them.
  • Slide 21
  • Infinite Number of Tables: Normal distributions differ by mean and standard deviation. Each distribution would require its own table. That's an infinite number!
  • Slide 22
  • Standardize the Normal Distribution: One table! (Figure: a normal distribution converted to the standardized normal distribution.)
  • Slide 23
  • The standardized score is often called the z-score. Once you know the z-score for an observed value, you can easily find the percentile corresponding to the observed value by using the table that gives the percentiles for a normal distribution with mean 0 and standard deviation 1. A normal curve with a mean of 0 and a standard deviation of 1 is called a standard normal curve. It is the curve that results when any normal curve is converted to standardized scores.
  • Slide 24
  • Standardized Scores: Standardized score (standard score or z-score) = (observed value - mean)/(standard deviation). IQ scores have a normal distribution with a mean of 100 and a standard deviation of 16. Suppose your IQ score was 116. Standardized score = (116 - 100)/16 = +1. Your IQ is 1 standard deviation above the mean. Suppose your IQ score was 84. Standardized score = (84 - 100)/16 = -1. Your IQ is 1 standard deviation below the mean. A normal curve with mean = 0 and standard deviation = 1 is called a standard normal curve.
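The standardized-score calculation above can be sketched in a few lines. The helper name `z_score` is hypothetical; the numbers are the IQ figures from the slide.

```python
def z_score(observed, mean, sd):
    """Number of standard deviations the observed value lies above (+) or below (-) the mean."""
    return (observed - mean) / sd

iq_high = z_score(116, 100, 16)
iq_low = z_score(84, 100, 16)
print(iq_high)  # 1.0
print(iq_low)   # -1.0
```

Keeping the sign in the return value matters, because it is what tells you which side of the mean the observation falls on.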
  • Slide 25
  • Table 8.1: Proportions and Percentiles for Standard Normal Scores
  • Slide 26
  • Finding a Percentile from an Observed Value: 1. Find the standardized score = (observed value - mean)/s.d., where s.d. = standard deviation. Don't forget to keep the plus or minus sign. 2. Look up the percentile in Table 8.1. Suppose your IQ score was 116. Standardized score = (116 - 100)/16 = +1. Your IQ is 1 standard deviation above the mean. From Table 8.1 you would be at the 84th percentile. Your IQ would be higher than that of 84% of the population.
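A rough sketch of the two steps above, with the table lookup replaced by a direct calculation. This is an assumption on our part: the lecture uses Table 8.1, whereas the code evaluates the standard normal area with the error function. The function names are illustrative.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percentile_from_value(observed, mean, sd):
    """Step 1: standardize. Step 2: convert the z-score to a percentile."""
    z = (observed - mean) / sd
    return 100.0 * standard_normal_cdf(z)

iq_percentile = percentile_from_value(116, 100, 16)
print(round(iq_percentile))  # 84
```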
  • Slide 27
  • Finding an Observed Value from a Percentile: 1. Look up the percentile in Table 8.1 and find the corresponding standardized score. 2. Compute observed value = mean + (standardized score)(s.d.), where s.d. = standard deviation. Example 1: Tragically Low IQ. Jury urges mercy for mother who killed baby. The mother had an IQ lower than 98 percent of the population. (Scotsman, March 8, 1994, p. 2) The mother was in the 2nd percentile. Table 8.1 gives her standardized score = -2.05, or 2.05 standard deviations below the mean of 100. Her IQ = 100 + (-2.05)(16) = 100 - 32.8 = 67.2, or about 67.
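The reverse lookup can be sketched with Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading Table 8.1 backwards. The helper name `value_from_percentile` is hypothetical.

```python
from statistics import NormalDist

def value_from_percentile(percentile, mean, sd):
    """Observed value = mean + (standardized score)(s.d.)."""
    z = NormalDist().inv_cdf(percentile / 100.0)  # standardized score for this percentile
    return mean + z * sd

iq = value_from_percentile(2, 100, 16)
print(round(iq))  # 67
```

The unrounded result differs slightly from the slide's 67.2 because the table rounds the standardized score to two decimals.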
  • Slide 28
  • The Standard Normal Table (Table 8.1): Table 8.1 is a table of areas under the standard normal density curve. The table entry for each value z is the area under the curve to the left of z.
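Table 8.1 itself is not reproduced in this transcript. As an illustration only, a few rows of a table of this kind can be computed with `statistics.NormalDist`; the z values chosen and the name `table_rows` are assumptions, and the output is calculated, not copied from the textbook.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def table_rows(z_values):
    """Pair each z with the area under the standard normal curve to its left."""
    return [(z, standard_normal.cdf(z)) for z in z_values]

rows = table_rows([-2.0, -1.0, 0.0, 1.0, 2.0])
for z, proportion in rows:
    print(f"z = {z:+.2f}   proportion below = {proportion:.4f}   percentile = {100 * proportion:.0f}")
```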
  • Slide 29
  • The Standard Normal Table: Table 8.1 can be used to find the proportion of observations of a variable which fall to the left of a specific value z if the variable follows a normal distribution.
  • Slide 30
  • Example 2: Calibrating Your GRE Score. GRE exams between 10/1/89 and 9/30/92 had a mean verbal score of 497 and a standard deviation of 115. (ETS, 1993) Suppose your score was 650 and scores were bell-shaped. Standardized score = (650 - 497)/115 = +1.33. In Table 8.1, z = 1.33 is between the 90th and 91st percentiles. Your score was higher than about 90% of the population.
  • Slide 31
  • Example 3: Removing Moles. Company Molegon: remove unwanted moles from gardens. Weights of moles are approximately normal with a mean of 150 grams and a standard deviation of 56 grams. Only moles between 68 and 211 grams can be legally caught. Standardized score = (68 - 150)/56 = -1.46, and standardized score = (211 - 150)/56 = +1.09. Table 8.1: 86% weigh 211 grams or less; 7% weigh 68 grams or less. About 86% - 7% = 79% are within the legal limits.
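The mole example is a "proportion between two values" calculation. A small sketch, assuming `statistics.NormalDist` in place of the table and using a hypothetical helper name:

```python
from statistics import NormalDist

def proportion_between(low, high, mean, sd):
    """Proportion of a normal population with values between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)  # area below high minus area below low

legal_share = proportion_between(68, 211, mean=150, sd=56)
print(f"{legal_share:.2f}")  # 0.79
```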
  • Slide 32
  • Standardizing Example (Figure: a normal distribution and the corresponding standardized normal distribution.)
  • Slide 33
  • Some Examples: Suppose it is known that verbal SAT scores are normally distributed with a mean of 500 and a standard deviation of 100. Find the proportion of the population of SAT scores that are less than or equal to 600. First we need to find the standardized score: z-score = (observed value - mean)/(standard deviation) = (600 - 500)/100 = +1. From Table 8.1 we see that a z-score of +1 is the 84th percentile, so the proportion of population SAT scores that are less than or equal to 600 is 0.84.
  • Slide 34
  • SAT SCORES
  • Slide 35
  • Standardized Scores (Z-Scores)
  • Slide 36
  • Estimate the proportion of population SAT scores that are greater than 400. First, we need to find the standardized score: z-score = (400 - 500)/100 = -1. From Table 8.1 we see that 16% of population values have a z-score less than or equal to -1 (or equivalently, 16% of population values have an observed score less than 400). However, we are interested in the proportion of the population with scores GREATER than 400. Proportion ABOVE 400 = 1 - proportion BELOW 400 = 1 - 0.16 = 0.84.
  • Slide 37
  • Slide 38
  • Estimate the proportion of population SAT scores that are between 400 and 600. An observed value of 400 has a z-score of -1 and represents the 16th percentile (proportion below z = -1 is 0.16). An observed value of 600 has a z-score of +1 and represents the 84th percentile (proportion below z = +1 is 0.84). Let's draw a picture.
  • Slide 39
  • So the proportion with scores between 400 and 600 = proportion below 600 - proportion below 400 = 0.84 - 0.16 = 0.68.
  • Slide 40
  • Find an SAT score such that 70% of the population had SAT scores less than or equal to this number (i.e., estimate the 70th percentile of the population). First we need to find the z-score that corresponds to the 70th percentile. From Table 8.1 we see that this z-score is +0.52. Next we need to find the observed value (from the z-score): Observed value = mean + (z-score)*(standard deviation) = 500 + 0.52*100 = 552.
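The four SAT questions worked through on these slides can be checked together. This sketch assumes the stated mean of 500 and standard deviation of 100 and uses `statistics.NormalDist` in place of Table 8.1; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

below_600 = sat.cdf(600)                    # proportion at or below 600
above_400 = 1.0 - sat.cdf(400)              # complement of the proportion below 400
between_400_600 = sat.cdf(600) - sat.cdf(400)
score_at_70th = sat.inv_cdf(0.70)           # observed value at the 70th percentile

print(f"below 600: {below_600:.2f}")
print(f"above 400: {above_400:.2f}")
print(f"between 400 and 600: {between_400_600:.2f}")
print(f"70th percentile: {score_at_70th:.0f}")
```

The results match the slide values of 0.84, 0.84, 0.68 and 552 after rounding; small differences in later decimals come from the table's rounded z-scores.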
  • Slide 41
  • 8.4 z-Scores and Familiar Intervals: Empirical Rule: For any normal curve, approximately: 68% of the values fall within 1 standard deviation of the mean in either direction; 95% of the values fall within 2 standard deviations of the mean in either direction; 99.7% of the values fall within 3 standard deviations of the mean in either direction. A measurement would be an extreme outlier if it fell more than 3 s.d. above or below the mean.
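The Empirical Rule percentages can be recovered from the standard normal curve itself. A minimal sketch (the helper name `share_within` is hypothetical):

```python
from statistics import NormalDist

def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} s.d.: {100 * share:.1f}%")
```

Because every normal curve standardizes to the same curve, these proportions do not depend on the particular mean or standard deviation.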
  • Slide 42
  • The 68-95-99.7 Rule
  • Slide 43
  • The Empirical Rule Applet: http://www.stat.sc.edu/~west/applets/empiricalrule.html
  • Slide 44
  • Heights of Adult Women: Since adult women in the U.S. have a mean height of 65 inches with a s.d. of 2.5 inches and heights are bell-shaped, approximately: 68% of adult women are between 62.5 and 67.5 inches; 95% of adult women are between 60 and 70 inches; 99.7% of adult women are between 57.5 and 72.5 inches.
  • Slide 45
  • For Those Who Like Formulas
  • Slide 46
  • Example: In Tombstone, Arizona Territory, people used Colt .45 revolvers. However, people used different ammunition. Wyatt Earp knew that his brothers and Doc Holliday were the only ones in the territory who used Colt .45s with Winchester ammunition. The Earp brothers conducted tests on many different combinations of weapons and ammunition. They found that the dataset of observations produced by the combination of Colt .45 with Winchester shells showed a mean velocity of 936 feet/second and a standard deviation of 10 feet/second.
  • Slide 47
  • The measurements were taken at a distance of 15 feet from the gun. When Wyatt examined the body of a cowboy shot in the back in cold blood, he concluded that he was shot at a distance of 15 feet and that the velocity of the bullet at impact was 1,000 feet/second. The dastardly Ike Clanton claimed that this cowboy was shot by the Earp brothers or Doc Holliday. Was Wyatt able to clear his good name using the Empirical Rule?
  • Slide 48
  • The distribution of this bullet velocity data should be approximately bell-shaped. This implies that the empirical rule should give a good estimation of the percentages of the data within each interval.
  • Slide 49
  • This table quite clearly demonstrates that, since the bullet velocity in the shooting was 1000 ft/sec, and since this lies more than 6 standard deviations away from the mean, the probability is extremely high that the Earps were not responsible for this shooting. This is especially evident from looking at the column showing percentages from the empirical rule. Practically 100% of bullet velocities should be between 896 and 976 ft/sec.
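The table referred to on this slide did not survive in the transcript. As a sketch under the stated mean of 936 ft/sec and standard deviation of 10 ft/sec, the intervals and the standardized score of the 1000 ft/sec bullet can be recomputed; the function names are illustrative, and the tail probability assumes velocities are exactly normal.

```python
import math

def z_score(observed, mean, sd):
    """Number of standard deviations between the observed value and the mean."""
    return (observed - mean) / sd

def interval(mean, sd, k):
    """Range covering k standard deviations either side of the mean."""
    return (mean - k * sd, mean + k * sd)

mean_velocity, sd_velocity = 936, 10
intervals = {k: interval(mean_velocity, sd_velocity, k) for k in (1, 2, 3, 4)}
for k, (low, high) in intervals.items():
    print(f"within {k} s.d.: {low} to {high} ft/sec")

z = z_score(1000, mean_velocity, sd_velocity)
# Upper-tail area beyond z, computed with erfc to avoid rounding to zero.
tail = 0.5 * math.erfc(z / math.sqrt(2.0))
print(f"z-score of 1000 ft/sec: {z:.1f}, proportion at or above: {tail:.1e}")
```

The 4 s.d. interval reproduces the 896 to 976 ft/sec range quoted on the slide.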
  • Slide 50
  • Example: P(3.8 ≤ X ≤ 5) = 0.0478. (Figure: normal distribution and standardized normal distribution; shaded area exaggerated.)
  • Slide 51
  • Example: P(2.9 ≤ X ≤ 7.1) = 0.0832 + 0.0832 = 0.1664. (Figure: normal distribution and standardized normal distribution; shaded area exaggerated.)
  • Slide 52
  • Example: P(X ≥ 8) = 0.5000 - 0.1179 = 0.3821. (Figure: normal distribution and standardized normal distribution; shaded area exaggerated.)
  • Slide 53
  • Example: P(7.1 ≤ X ≤ 8) = 0.1179 - 0.0832 = 0.0347. (Figure: normal distribution and standardized normal distribution; shaded area exaggerated.)
  • Slide 54
  • Solution*: P(2000 ≤ X ≤ 2400) = 0.4772. (Figure: normal distribution and standardized normal distribution.)