copyright © 2010 pearson education, inc. slide 5 - 1

22
Slide 5 - 1 Copyright © 2010 Pearson Education, Inc.

Upload: sofia-griffith

Post on 27-Mar-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 1Copyright © 2010 Pearson Education, Inc.

Page 2: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 2Copyright © 2010 Pearson Education, Inc.

Solution: A

Page 3: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Copyright © 2010 Pearson Education, Inc.

Chapter 5Understanding and Comparing Distributions

Page 4: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 4Copyright © 2010 Pearson Education, Inc.

The Big Picture Below is a histogram of the Average Wind Speed for every

day in 1989. Describe the data.

Page 5: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 5Copyright © 2010 Pearson Education, Inc.

The Big Picture (cont.)

The distribution is unimodal and skewed to the right.

The high value may be an outlier Median daily wind

speed is about 1.90 mph and the IQR is reported to be 1.78 mph.

Page 6: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 6Copyright © 2010 Pearson Education, Inc.

The Five-Number Summary

The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). Example: The five-

number summary for the daily wind speed is:

Max 8.67

Q3 2.93

Median 1.90

Q1 1.15

Min 0.20

Page 7: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 7Copyright © 2010 Pearson Education, Inc.

Daily Wind Speed: Making Boxplots

A boxplot is a graphical display of the five-number summary.

Boxplots are useful when comparing groups. Boxplots are particularly good at pointing out

outliers.

Page 8: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 8Copyright © 2010 Pearson Education, Inc.

Constructing Boxplots

1. Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box.

Page 9: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 9Copyright © 2010 Pearson Education, Inc.

Constructing Boxplots (cont.)

2. Erect “fences” around the main part of the data.

The upper fence is 1.5 IQRs above the upper quartile.

The lower fence is 1.5 IQRs below the lower quartile.

Note: the fences only help with constructing the boxplot and should not appear in the final display.

Page 10: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 10Copyright © 2010 Pearson Education, Inc.

Constructing Boxplots (cont.)

3. Use the fences to grow “whiskers.”

Draw lines from the ends of the box up and down to the most extreme data values found within the fences.

If a data value falls outside one of the fences, we do not connect it with a whisker.

Page 11: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 11Copyright © 2010 Pearson Education, Inc.

Constructing Boxplots (cont.)

4. Add the outliers by displaying any data values beyond the fences with special symbols.

We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.

Page 12: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 12Copyright © 2010 Pearson Education, Inc.

Wind Speed: Making Boxplots (cont.)

Compare the histogram and boxplot for daily wind speeds:

How does each display represent the distribution?

Page 13: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 13Copyright © 2010 Pearson Education, Inc.

Comparing Groups

It is almost always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of

the two distributions.

What does this graphical display tell you?

Page 14: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 14Copyright © 2010 Pearson Education, Inc.

Comparing Groups (cont.) Boxplots offer an ideal balance of information and simplicity, hiding

the details while displaying the overall summary information. We often plot them side by side for groups or categories we wish

to compare.

What do these boxplots tell you?

Page 15: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 15Copyright © 2010 Pearson Education, Inc.

What About Outliers?

If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with the outliers removed. The differences may be quite revealing.

Note: The median and IQR are not likely to be affected by the outliers.

Page 16: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 16Copyright © 2010 Pearson Education, Inc.

Timeplots

For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

Page 17: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 17Copyright © 2010 Pearson Education, Inc.

Re-expressing Skewed Data to Improve Symmetry

When the data are skewed it can be hard to summarize them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched out tail.

How can we say anything useful about such data?

Page 18: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 18Copyright © 2010 Pearson Education, Inc.

Re-expressing Skewed Data to Improve Symmetry (cont.)

One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function).

Note the change in skewness from the raw data (previous slide) to the transformed data (right):

Page 19: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 19Copyright © 2010 Pearson Education, Inc.

What Can Go Wrong? (cont.)

Avoid inconsistent scales, either within the display or when comparing two displays.

Label clearly so a reader knows what the plot displays. Good intentions, bad

plot:

Page 20: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 20Copyright © 2010 Pearson Education, Inc.

What Can Go Wrong? (cont.)

Beware of outliers

Be careful when comparing groups that have very different spreads. Consider these

side-by-side boxplots of cotinine levels:

Re-express . . .

Page 21: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 21Copyright © 2010 Pearson Education, Inc.

TI Tips Boxplots

Boys: 22, 17, 18, 29, 22, 22, 23, 24, 23, 17, 21 Girls: 25, 20, 12, 19, 28, 24, 22, 21, 25, 26, 25, 16, 27, 22 Enter data into L1 (Boys) and L2 (Girls) STATPLOT, plot1 Turn the plot On Choose the first boxplot icon (you want your boxplot to

indicate outliers) Specify: Xlist:L1 and Freq: 1 and select the Mark you want

the calculator to use for outliers. Use Zoomstat to display. You can trace to see the five

number summary. Set up Plot2 to display the girls data. Zoomstat will show both.

Page 22: Copyright © 2010 Pearson Education, Inc. Slide 5 - 1

Slide 5 - 22Copyright © 2010 Pearson Education, Inc.

Homework: Pg. 95 5-29 (odd) (skip 11,25)