Means & Medians, Chapter 4

Post on 14-Jan-2016


  • Means & Medians, Chapter 4

  • Parameter - Fixed value about a population. Typically unknown.

  • Statistic - Value calculated from a sample

  • Measures of Central Tendency
    Median - the middle of the data; 50th percentile
    Observations must be in numerical order
    Is the single middle value if n is odd
    The average of the middle two values if n is even
    NOTE: n denotes the sample size

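  • Measures of Central Tendency
    Mean - the arithmetic average
    Use μ to represent a population mean (parameter)
    Use x̄ to represent a sample mean (statistic)
    Formula:
    $\mu = \frac{\sum x}{N}$ (parameter, where N is the population size)
    $\bar{x} = \frac{\sum x}{n}$ (statistic, where n is the sample size)

    Σ is the capital Greek letter sigma; it means to sum the values that follow.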

  • Measures of Central Tendency
    Mode - the observation that occurs the most often
    Can be more than one mode
    If all values occur only once, there is no mode
    Not used as often as mean & median
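The mode rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `modes` and the sample lists are made up for the example. It returns every value tied for the highest count, and an empty list when all values occur only once.

```python
from collections import Counter

def modes(values):
    """Return all modes, or an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:                      # every value occurs only once: no mode
        return []
    # more than one value can share the highest count
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical data, chosen only to show the three cases
print(modes([2, 3, 3, 4, 8]))     # one mode
print(modes([2, 3, 3, 4, 4, 8]))  # two modes
print(modes([2, 3, 4, 8, 12]))    # no mode
```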

  • Suppose we are interested in the number of lollipops that are bought at a certain store. A sample of 5 customers buys the following number of lollipops. Find the median.
    2 3 4 8 12
    The numbers are in order & n is odd, so find the middle observation.
    The median is 4 lollipops!

  • Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the median.
    2 3 4 6 8 12
    The numbers are in order & n is even, so find the middle two observations (4 and 6).
    Now, average these two values: (4 + 6) / 2 = 5
    The median is 5 lollipops!
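The two lollipop samples can be checked with a short Python sketch of the median rule (sort, then take the middle value for odd n or average the middle two for even n). The function name `median_of` is chosen for this example only.

```python
def median_of(values):
    """Median by the slide's rule: order the data, then find the middle."""
    ordered = sorted(values)      # observations must be in numerical order
    n = len(ordered)              # n denotes the sample size
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:                # n odd: the single middle value
        return ordered[mid]
    # n even: average of the middle two values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([2, 3, 4, 8, 12]))     # sample of 5 customers
print(median_of([2, 3, 4, 6, 8, 12]))  # sample of 6 customers
```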

  • Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the mean.
    2 3 4 6 8 12
    To find the mean number of lollipops, add the observations and divide by n.
    $\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 12}{6} = \frac{35}{6} \approx 5.83$
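As a quick check of the "add the observations and divide by n" rule, here is a minimal Python sketch using the six lollipop counts from this slide. The function name `sample_mean` is an assumption made for the example.

```python
def sample_mean(values):
    """Arithmetic average: sum of the observations divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("mean of an empty sample is undefined")
    return sum(values) / n

lollipops = [2, 3, 4, 6, 8, 12]
print(round(sample_mean(lollipops), 2))  # 35 / 6, rounded to two decimals
```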

  • Using the calculator . . .

  • What would happen to the median & mean if the 12 lollipops were 20?
    2 3 4 6 8 20
    The median is . . . 5
    The mean is . . . 7.17
    What happened?

  • What would happen to the median & mean if the 20 lollipops were 50?
    2 3 4 6 8 50
    The median is . . . 5
    The mean is . . . 12.17
    What happened?

  • Resistant - Statistics that are not affected by outliers

    Is the median resistant? YES
    Is the mean resistant? NO

    IMPORTANT: The median is resistant to outliers. The mean is NOT resistant to outliers.
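The lollipop experiment above (replacing the largest value 12 with 20, then 50) can be reproduced with Python's standard `statistics` module. This is only a sketch; the helper name `center_with_largest` is made up for the example.

```python
from statistics import mean, median

def center_with_largest(largest):
    """Median and mean of the lollipop sample with a given largest value."""
    sample = [2, 3, 4, 6, 8, largest]
    return median(sample), round(mean(sample), 2)

for largest in (12, 20, 50):
    med, avg = center_with_largest(largest)
    # the median stays put while the mean follows the extreme value
    print(f"largest={largest}: median={med}, mean={avg}")
```

The median depends only on the middle two observations, so changing the largest value leaves it alone; the mean uses every value, so it moves.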

  • Look at the following data set. Find the mean.
    22 23 24 25 25 26 29 30
    Now find how each observation deviates from the mean. (Observation minus mean is the deviation from the mean.)
    What is the sum of the deviations from the mean? 0
    Will this sum always equal zero? YES
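A small Python sketch of the deviation calculation, using the eight values on this slide read as 22, 23, 24, 25, 25, 26, 29, 30. The function name `deviations_from_mean` is an assumption for the example. The sum is zero for any data set because subtracting the mean from each of the n values removes exactly n times the mean from the total.

```python
import math

def deviations_from_mean(values):
    """Return the mean and the list of (observation - mean) deviations."""
    avg = sum(values) / len(values)
    return avg, [x - avg for x in values]

data = [22, 23, 24, 25, 25, 26, 29, 30]
avg, devs = deviations_from_mean(data)
print(avg)        # the mean
print(devs)       # how each observation deviates from the mean
print(sum(devs))  # sum of the deviations

# With floating-point data the sum may differ from 0 by rounding error,
# so compare with a tolerance rather than with ==.
print(math.isclose(sum(devs), 0.0, abs_tol=1e-9))
```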

  • Look at the following data set. Find the mean & median.
    21 23 23 24 25 25 26 26 26 27 27 27 27 28 30 30 30 31 32 32
    Create a histogram with the data (use x-scale of 2). Then find the mean and median.
    Mean = 27
    Median = 27
    Look at the placement of the mean and median in this symmetrical distribution.

  • Look at the following data set. Find the mean & median.
    22 29 28 22 24 25 28 21 25 23 24 23 26 36 38 62 23
    Create a histogram with the data (use x-scale of 8). Then find the mean and median.
    Mean = 28.176
    Median = 25
    Look at the placement of the mean and median in this right-skewed distribution.

  • Look at the following data set. Find the mean & median.
    21 46 54 47 53 60 55 55 60 56 58 58 58 58 62 63 64
    Create a histogram with the data. Then find the mean and median.
    Mean = 54.588
    Median = 58
    Look at the placement of the mean and median in this left-skewed distribution.
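The three data sets above can be compared side by side with the standard `statistics` module. The lists below are the values as read from the slides (20, 17, and 17 observations); the variable names and the helper `mean_vs_median` are assumptions made for this sketch. Comparing mean and median is only a rough indicator of skew direction, not a formal test.

```python
from statistics import mean, median

symmetric = [21, 23, 23, 24, 25, 25, 26, 26, 26, 27,
             27, 27, 27, 28, 30, 30, 30, 31, 32, 32]
right_skewed = [22, 29, 28, 22, 24, 25, 28, 21, 25,
                23, 24, 23, 26, 36, 38, 62, 23]
left_skewed = [21, 46, 54, 47, 53, 60, 55, 55, 60,
               56, 58, 58, 58, 58, 62, 63, 64]

def mean_vs_median(values):
    """Return (mean rounded to 3 places, median, how the two compare)."""
    avg, med = mean(values), median(values)
    if avg > med:
        relation = "mean > median"   # mean pulled toward a long right tail
    elif avg < med:
        relation = "mean < median"   # mean pulled toward a long left tail
    else:
        relation = "mean = median"
    return round(avg, 3), med, relation

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(name, mean_vs_median(data))
```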

  • Recap:
    In a symmetrical distribution, the mean and median are equal.
    In a skewed distribution, the mean is pulled in the direction of the skewness.
    In a symmetrical distribution, you should report the mean!
    In a skewed distribution, the median should be reported as the measure of center!

  • Example calculations
    During a two-week period, 10 houses were sold in Fancytown.
    The average or mean price for this sample of 10 houses in Fancytown is $295,000.

  • During a two-week period, 10 houses were sold in Lowtown.
    The average or mean price for this sample of 10 houses in Lowtown is $295,000.

  • Looking at the dotplots of the samples for Fancytown and Lowtown, we can see that the mean, $295,000, appears to accurately represent the center of the data for Fancytown, but it is not representative of the Lowtown data.
    Clearly, the mean can be greatly affected by the presence of even a single outlier.

  • In the previous example of the house prices in the sample of 10 houses from Lowtown, the mean was affected very strongly by the one house with the extremely high price.
    The other 9 houses had selling prices around $100,000.
    This illustrates that the mean can be very sensitive to a few extreme values.

    SOOOO

  • Describing the Center of a Data Set with the Median
    The sample median is obtained by first ordering the n observations from smallest to largest (with any repeated values included, so that every sample observation appears in the ordered list). Then
    sample median = the single middle value if n is odd
    sample median = the mean of the middle two values if n is even

  • Example of Median Calculation
    Consider the Fancytown data. First, we put the data in increasing numerical order to get
    231,000 285,000 287,000 294,000 297,000 299,000 312,000 313,000 315,000 317,000
    Since there are 10 (even) data values, the median is the mean of the two values in the middle:
    median = (297,000 + 299,000) / 2 = $298,000

  • Consider the Lowtown data. We put the data in increasing numerical order to get
    93,000 95,000 97,000 99,000 100,000 110,000 113,000 121,000 122,000 2,000,000
    Since there are 10 (even) data values, the median is the mean of the two values in the middle:
    median = (100,000 + 110,000) / 2 = $105,000
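The Fancytown and Lowtown comparison can be reproduced with the standard `statistics` module, using the ordered prices listed on the two slides above. The variable names and the helper `summarize` are assumptions for this sketch.

```python
from statistics import mean, median

fancytown = [231_000, 285_000, 287_000, 294_000, 297_000,
             299_000, 312_000, 313_000, 315_000, 317_000]
lowtown = [93_000, 95_000, 97_000, 99_000, 100_000,
           110_000, 113_000, 121_000, 122_000, 2_000_000]

def summarize(prices):
    """Return (mean, median) of a list of selling prices."""
    return mean(prices), median(prices)

for town, prices in [("Fancytown", fancytown), ("Lowtown", lowtown)]:
    avg, med = summarize(prices)
    print(f"{town}: mean = ${avg:,.0f}, median = ${med:,.0f}")
```

Both samples share the same mean, but only the median separates the two towns, because the single $2,000,000 sale in Lowtown pulls the mean upward.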

  • Typically,
    when a distribution is skewed positively, the mean is larger than the median,
    when a distribution is skewed negatively, the mean is smaller than the median, and
    when a distribution is symmetric, the mean and the median are equal.

  • Trimmed mean:
    Purpose is to remove outliers from a data set
    To calculate a trimmed mean:
    Multiply the % to trim by n
    Truncate that many observations from BOTH ends of the distribution (when listed in order)
    Calculate the mean with the shortened data set

  • Find a 10% trimmed mean with the following data.
    12 14 19 20 22 24 25 26 26 35
    10%(10) = 1, so remove one observation from each side!
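The trimmed-mean steps can be sketched in Python as follows. This is one possible implementation under stated assumptions, not necessarily how the slides' calculator does it: the function name `trimmed_mean` is made up, the trim percentage is passed as a whole number, and a fractional count of observations is rounded down.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the observations from EACH end."""
    ordered = sorted(values)            # list the data in order first
    n = len(ordered)
    k = (percent * n) // 100            # number to drop per end, rounded down
    if 2 * k >= n:
        raise ValueError("trimming would remove every observation")
    kept = ordered[k:n - k]             # truncate k values from both ends
    return sum(kept) / len(kept)

data = [12, 14, 19, 20, 22, 24, 25, 26, 26, 35]
print(sum(data) / len(data))    # ordinary mean of all 10 values
print(trimmed_mean(data, 10))   # 10% of 10 is 1: drop 12 and 35, average the rest
```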

  • Example of Trimmed Mean

    Use scale of 2 on graph