14 Descriptive Statistics

Upload: onslow

Post on 14-Feb-2016


DESCRIPTION

14 Descriptive Statistics. 14.1 Graphical Descriptions of Data 14.2 Variables 14.3 Numerical Summaries 14.4 Measures of Spread. Data Set.

TRANSCRIPT

Page 1: 14  Descriptive Statistics
Excursions in Modern Mathematics, 7e: 14.1 - 2 Copyright © 2010 Pearson Education, Inc.

14 Descriptive Statistics

14.1 Graphical Descriptions of Data
14.2 Variables
14.3 Numerical Summaries
14.4 Measures of Spread

A data set is a collection of data values. Statisticians often refer to the individual data values in a data set as data points. For the sake of simplicity, we will work with data sets in which each data point consists of a single number, but in more complicated settings, a single data point can consist of many numbers.

Data Set

As usual, we will use the letter N to represent the size of the data set. In real-life applications, data sets can range in size from reasonably small (a dozen or so data points) to very large (hundreds of millions of data points), and the larger the data set is, the more we need a good way to describe and summarize it.

The day after the midterm exam in his Stat 101 class, Dr. Blackbeard posted the results online. The data set consists of N = 75 data points (the number of students who took the test). Each data point (listed in the second column) is a score between 0 and 25 (Dr. Blackbeard gives no partial credit). Notice that the numbers listed in the first column are not data points; they are numerical IDs used as substitutes for names to protect the students’ privacy rights.

Example 14.1 Stat 101 Test Scores

Like students everywhere, the students in the Stat 101 class have one question foremost on their minds when they look at the results: How did I do? Each student can answer this question directly from the table. It’s the next question that is statistically much more interesting: How did the class as a whole do? To answer this last question, we will have to find a way to package the results into a compact, organized, and intelligible whole.

The first step in summarizing the information in Table 14-1 is to organize the scores in a frequency table such as Table 14-2. In this table, the number below each score gives the frequency of the score, that is, the number of students who got that particular score.

Example 14.2 Stat 101 Test Scores: Part 2

We can readily see from Table 14-2 that there was one student with a score of 1, one with a score of 6, two with a score of 7, six with a score of 8, and so on. Notice that the scores with a frequency of zero are not listed in the table.
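The tallying step behind a frequency table can be sketched in a few lines of Python. This is only an illustration: the short list of scores is a hypothetical sample, not the Table 14-1 data, and the function name `frequency_table` is an assumption of this sketch.

```python
from collections import Counter


def frequency_table(scores):
    """Return (score, frequency) pairs in increasing order of score.

    Scores that never occur are left out, just as in a frequency table.
    """
    counts = Counter(scores)
    return sorted(counts.items())


# Hypothetical sample for illustration only, NOT the Stat 101 data.
sample_scores = [8, 7, 8, 1, 6, 7, 8, 10, 8, 10]

for score, freq in frequency_table(sample_scores):
    print(f"score {score:2d}: frequency {freq}")
```

In this sample the scores 0, 2, 3, 4, 5 and 9 never occur, so they do not appear in the output at all.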

We can do even better. Figure 14-1 (next slide) shows the same information in a much more visual form called a bar graph, with the test scores listed in increasing order on the horizontal axis and the frequency of each test score displayed by the height of the column above that test score. Notice that in the bar graph, even the test scores with a frequency of zero show up; there simply is no column above these scores.
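As a rough sketch of the same idea, the Python snippet below prints a text-only bar graph. It draws the bars sideways (one row per score rather than one column), uses a hypothetical sample instead of the Stat 101 data, and the name `text_bar_graph` is an assumption of this sketch.

```python
from collections import Counter


def text_bar_graph(scores, low, high):
    """Return one line per score from low to high; bar length = frequency."""
    counts = Counter(scores)
    lines = []
    for score in range(low, high + 1):
        # A Counter gives 0 for scores that never occur, so those rows
        # still appear in the graph, only with an empty bar.
        lines.append(f"{score:2d} | " + "#" * counts[score])
    return lines


# Hypothetical sample for illustration only, NOT the Stat 101 data.
sample_scores = [8, 7, 8, 1, 6, 7, 8, 10, 8, 10]
print("\n".join(text_bar_graph(sample_scores, 0, 10)))
```

Unlike a frequency table, every score in the range gets its own row here, including the ones with a frequency of zero.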

Figure 14-1

Bar graphs are easy to read, and they are a nice way to present a good general picture of the data. With a bar graph, for example, it is easy to detect outliers: extreme data points that do not fit into the overall pattern of the data. In this example there are two obvious outliers: the score of 24 (head and shoulders above the rest of the class) and the score of 1 (lagging way behind the pack).

Sometimes it is more convenient to express the bar graph in terms of relative frequencies, that is, the frequencies given as percentages of the total population. Figure 14-2 shows a relative frequency bar graph for the Stat 101 data set. Notice that we indicated on the graph that we are dealing with percentages rather than total counts and that the size of the data set is N = 75.

Figure 14-2

Indicating the size of the data set allows anyone who wishes to do so to compute the actual frequencies. For example, Fig. 14-2 indicates that 12% of the 75 students scored a 12 on the exam, so the actual frequency is given by 75 × 0.12 = 9 students. The change from actual frequencies to percentages (or vice versa) does not change the shape of the graph; it is basically a change of scale.
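The conversion in both directions can be sketched as follows. The helper names `to_percentages` and `actual_frequency` are assumptions of this sketch; the only figures taken from the text are N = 75 and the 12% share for a score of 12.

```python
def to_percentages(frequencies):
    """Convert a {score: frequency} mapping to {score: percent of total}."""
    n = sum(frequencies.values())
    # Every frequency is multiplied by the same constant 100 / n,
    # which is why the shape of the bar graph does not change.
    return {score: 100 * f / n for score, f in frequencies.items()}


def actual_frequency(percent, n):
    """Recover a count from a relative frequency and the data set size."""
    return round(n * percent / 100)


# From the text: 12% of N = 75 students scored a 12.
print(actual_frequency(12, 75))  # 75 x 0.12 = 9
```

Rounding is used in `actual_frequency` because percentages read off a graph are usually rounded, while counts must be whole numbers.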

Frequency charts that use icons or pictures instead of bars to show the frequencies are commonly referred to as pictograms. The point of a pictogram is that a graph is often used not only to inform but also to impress and persuade, and, in such cases, a well-chosen icon or picture can be a more effective tool than just a bar. Here’s a pictogram displaying the same data as in Figure 14-2.

Bar Graph versus Pictogram

Figure 14-3

This figure is a pictogram showing the growth in yearly sales of the XYZ Corporation between 2001 and 2006. It’s a good picture to show at a shareholders meeting, but the picture is actually quite misleading.

Example 14.3 Selling the XYZ Corporation

This figure shows a pictogram for exactly the same data, giving a much more accurate and sobering picture of how well the XYZ Corporation had been doing.

The difference between the two pictograms can be attributed to a couple of standard tricks of the trade: (1) stretching the scale of the vertical axis and (2) “cheating” on the choice of starting value on the vertical axis. As an educated consumer, you should always be on the lookout for these tricks. In graphical descriptions of data, a fine line separates objectivity from propaganda.
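The effect of trick (2) can be made concrete with a little arithmetic. The sketch below compares the drawn heights of two bars when the vertical axis starts at zero and when it starts just below the smaller value. The sales figures are hypothetical (the XYZ numbers are not given in the text), and the function name `apparent_growth` is an assumption of this sketch. Trick (1), stretching the scale, is not modeled here.

```python
def apparent_growth(first, last, axis_start=0.0):
    """Ratio of the drawn heights of the last bar to the first bar
    when the vertical axis starts at axis_start instead of zero."""
    if axis_start >= min(first, last):
        raise ValueError("the axis must start below both values")
    return (last - axis_start) / (first - axis_start)


# Hypothetical yearly sales, for illustration only.
first_year, last_year = 100.0, 120.0

print(apparent_growth(first_year, last_year))        # axis starts at 0
print(apparent_growth(first_year, last_year, 95.0))  # axis starts at 95
```

With these made-up numbers, a real increase of 20% is drawn as a bar five times as tall once the axis starts at 95.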