Post on 19-Dec-2015
© University of Minnesota-Duluth, Summer 2009-Econ-2030(Dr. Tadesse)
Chapter 2
Descriptive Statistics
Learning Objectives:
1. Learn how to construct and interpret summaries of qualitative and quantitative data using frequency and relative frequency distributions, bar graphs and pie charts, a dot plot, a histogram, and an ogive.
2. Learn how to describe the shape of a data distribution (negatively skewed, symmetric, or positively skewed).
Descriptive Statistics:
Part A: Tabular and Graphical Presentations
Part B: Exploratory Data Analysis; Cross-tabulations; Scatter Diagrams
Tabular and Graphical Presentations:
Visual Description of Data
Part A
Tabular and Graphical Procedures
Overview of Tabular and Graphical Methods for Summarizing Data
Two Types of Data: Qualitative Data and Quantitative Data

Qualitative Data
Tabular Methods
• Frequency Distribution
• Relative Frequency Distribution
• Percent Frequency Distribution
• Crosstabulation
Graphical Methods
• Bar Graph
• Pie Chart

Quantitative Data
Tabular Methods
• Frequency Distribution
• Relative Frequency Distribution
• Cumulative Frequency Distribution
• Cumulative Relative Frequency Distribution
• Stem-and-Leaf Display
• Crosstabulation
Graphical Methods
• Dot Plot
• Histogram
• Ogive
• Scatter Diagram
Summarizing Qualitative Data
2.1) Summarizing Qualitative Data
Tabular Methods
1. Frequency Distribution
2. Relative Frequency Distribution
3. Percent Frequency Distribution
Graphical Methods
4. Bar Graph
5. Pie Chart
2.1) Summarizing Qualitative Data
The objective is to provide insights about the data that cannot be quickly obtained by looking at the original data.
2.1. Frequency Distribution
A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.
Example: Days Inn
Guests staying at Days Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:
Below Average, Above Average, Above Average, Average, Above Average, Average, Above Average,
Average, Above Average, Below Average, Poor, Excellent, Above Average, Average,
Above Average, Above Average, Below Average, Poor, Above Average, Average
Frequency Distribution
Rating           Frequency
Poor             2
Below Average    3
Average          5
Above Average    9
Excellent        1
Total            20
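As a rough illustration of how such a table can be tallied, the sketch below counts the 20 Days Inn ratings with Python's standard `collections.Counter`. The variable names (`ratings`, `classes`, `frequency`) are illustrative choices, not part of the slides.

```python
from collections import Counter

# The 20 ratings from the Days Inn sample, in the order listed in the example.
ratings = [
    "Below Average", "Above Average", "Above Average", "Average",
    "Above Average", "Average", "Above Average",
    "Average", "Above Average", "Below Average", "Poor",
    "Excellent", "Above Average", "Average",
    "Above Average", "Above Average", "Below Average", "Poor",
    "Above Average", "Average",
]

# Classes listed from worst to best so the output reads like the table.
classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)
frequency = {c: counts[c] for c in classes}

for rating, freq in frequency.items():
    print(f"{rating:<17}{freq:>3}")
print(f"{'Total':<17}{sum(frequency.values()):>3}")
```

Each rating falls in exactly one class, so the frequencies add up to the sample size of 20.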
Relative Frequency Distribution
The relative frequency is the fraction or proportion of the total number of data items belonging to a given class (category).
A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution
The percent frequency of a class is the relative frequency multiplied by 100.
A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
Relative Frequency and Percent Frequency Distributions
Rating           Relative Frequency    Percent Frequency
Poor             .10                   10
Below Average    .15                   15
Average          .25                   25
Above Average    .45                   45
Excellent        .05                   5
Total            1.00                  100
Example calculations: relative frequency of Excellent = 1/20 = .05; percent frequency of Poor = .10(100) = 10
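The two definitions can be sketched in a few lines of Python, starting from the Days Inn frequencies. This is only an illustration; the names `relative` and `percent` are assumptions, and the percent frequency is computed as `100 * f / n` (equivalent to the relative frequency times 100) to keep floating-point noise out of the result.

```python
# Frequencies from the Days Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}
n = sum(frequency.values())  # total number of data items (20)

# Relative frequency: the class frequency divided by n.
relative = {rating: f / n for rating, f in frequency.items()}

# Percent frequency: relative frequency times 100, written as 100 * f / n.
percent = {rating: 100 * f / n for rating, f in frequency.items()}

for rating in frequency:
    print(f"{rating:<17}{relative[rating]:>6.2f}{percent[rating]:>6.0f}")
print(f"{'Total':<17}{sum(relative.values()):>6.2f}{sum(percent.values()):>6.0f}")
```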
Bar Graph
A bar graph is a graphical device for depicting qualitative data.
On one axis (usually the horizontal axis), we specify the labels that represent each class (category). A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).
A bar of fixed width is drawn above each class label, with its height extended to the value for that class on the chosen scale.
The bars are separated to emphasize the fact that each class is a separate category.
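To make the construction concrete, here is a minimal text-only sketch of a bar graph for the Days Inn frequencies. It is rotated relative to the description above (class labels run down the left and bars extend horizontally) purely because that is easier to print; the function name `text_bar_graph` is hypothetical.

```python
# Frequencies from the Days Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

def text_bar_graph(freq, symbol="#"):
    """Return one text bar per class; the bar length equals the frequency."""
    width = max(len(label) for label in freq)
    lines = []
    for label, count in freq.items():
        # Each class gets its own line, so the bars stay visibly separate.
        lines.append(f"{label:<{width}} | {symbol * count} {count}")
    return "\n".join(lines)

graph = text_bar_graph(frequency)
print(graph)
```

A plotting library would normally be used for a presentation-quality graph; the text version only shows how bar length tracks class frequency.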
Recall The Frequency Distribution for Days Inn
Rating           Frequency
Poor             2
Below Average    3
Average          5
Above Average    9
Excellent        1
Total            20
Bar Graph: Days Inn Quality Ratings
(Figure: horizontal axis shows Rating, with classes Poor, Below Average, Average, Above Average, Excellent; vertical axis shows Frequency, scaled from 1 to 10.)
Pie Chart
The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
Example: As there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
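The same arithmetic can be applied to every class at once. The sketch below uses the Days Inn relative frequencies; the name `sector_degrees` is an illustrative choice.

```python
# Relative frequencies for the Days Inn quality ratings.
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class receives its share of the 360 degrees in a circle.
sector_degrees = {rating: rf * 360 for rating, rf in relative.items()}

for rating, degrees in sector_degrees.items():
    print(f"{rating:<17}{degrees:>6.1f} degrees")
```

Because the relative frequencies sum to 1, the sector angles sum to 360 degrees and the sectors fill the circle exactly.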
Recall The Relative Frequency for Days Inn
Rating           Relative Frequency
Poor             .10
Below Average    .15
Average          .25
Above Average    .45
Excellent        .05
Total            1.00
Pie Chart: Days Inn Quality Ratings
(Figure: sectors labeled Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.)
Some Insights about Days Inn—from the Preceding Pie Chart
Example: Days Inn
• One-half of the customers surveyed gave Days Inn a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager.
• For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.
Summarizing Quantitative Data
Summarizing Data
2.2) Summarizing Quantitative Data
1. Frequency Distribution
2. Relative Frequency and Percent Frequency Distributions
3. Dot Plot
4. Histogram
5. Cumulative Distributions
6. Ogive
Tabular Methods: items 1, 2, and 5
Graphical Methods: items 3, 4, and 6
2.2.1) Frequency Distribution
Three steps are needed to define the classes of a frequency distribution for quantitative data:
1. Determine the number of non-overlapping classes (categories), since quantitative data do not naturally come in categories.
2. Determine the width of each class (equal class widths are preferred).
3. Specify the class limits so that a given observation belongs to one and only one class.
2.2.1) Frequency Distribution
To Determine the NUMBER of Classes:
• Use a rule of thumb of 5 to 20 classes.
• Larger data sets usually require a larger number of classes.
• Smaller data sets usually require fewer classes.
2.2.1) Frequency Distribution
To Determine the WIDTH of Classes:
Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73
2.2.1) Frequency Distribution
For Hudson Auto Repair, if we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10

Parts Cost ($)    Frequency
50-59             2
60-69             13
70-79             16
80-89             7
90-99             7
100-109           5
Total             50
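The three steps (number of classes, class width, class limits) can be sketched for the Hudson Auto Repair sample as follows. Rounding the width up to 10 and starting the first class at 50 follow the slide; the variable names are illustrative assumptions.

```python
# Parts cost ($) for the 50 tune-ups in the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Step 1: choose the number of classes.
num_classes = 6

# Step 2: approximate class width = (largest - smallest) / number of classes.
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6
class_width = 10  # rounded up to a convenient value, as on the slide

# Step 3: class limits; 50 is the first lower limit used on the slide.
first_lower_limit = 50
classes = [
    (first_lower_limit + i * class_width,
     first_lower_limit + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Count the observations between each pair of limits (inclusive).
frequency = {
    f"{lo}-{hi}": sum(lo <= c <= hi for c in costs)
    for lo, hi in classes
}

print(f"Approximate class width: {approx_width}")
for label, freq in frequency.items():
    print(f"{label:<10}{freq:>3}")
print(f"{'Total':<10}{sum(frequency.values()):>3}")
```

Because the limits do not overlap and span 50 through 109, every one of the 50 observations is counted exactly once.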
2.2.2) Relative Frequency and Percent Frequency Distributions
Parts Cost ($)    Relative Frequency    Percent Frequency
50-59             .04                   4
60-69             .26                   26
70-79             .32                   32
80-89             .14                   14
90-99             .14                   14
100-109           .10                   10
Total             1.00                  100
Example calculations for the 50-59 class: relative frequency = 2/50 = .04; percent frequency = .04(100) = 4
2.2.2) Relative Frequency and Percent Frequency Distributions
Some Insights on the Parts Cost Data
• Only 4% of the parts costs are in the $50-59 class.
• The greatest percentage (32%, or almost one-third) of the parts costs is in the $70-79 class.
• 30% of the parts costs are under $70.
• 10% of the parts costs are $100 or more.
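Each of these insights can be read directly off the percent frequency distribution. The short sketch below shows one way to do that; the names `largest_class`, `under_70`, and `at_least_100` are illustrative.

```python
# Percent frequencies for the Hudson Auto Repair parts cost classes.
percent = {
    "50-59": 4,
    "60-69": 26,
    "70-79": 32,
    "80-89": 14,
    "90-99": 14,
    "100-109": 10,
}

# Class holding the greatest percentage of parts costs.
largest_class = max(percent, key=percent.get)

# Costs under $70 fall in the first two classes.
under_70 = percent["50-59"] + percent["60-69"]

# Costs of $100 or more fall in the last class.
at_least_100 = percent["100-109"]

print(f"Largest class: {largest_class} ({percent[largest_class]}%)")
print(f"Under $70: {under_70}%")
print(f"$100 or more: {at_least_100}%")
```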
2.2.3) Dot Plot
A dot plot is one of the simplest graphical summaries of data. It has two components:
• A horizontal axis that shows the range of data values.
• A dot placed above the axis for each data value.
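A minimal text sketch of the two components is given below. For brevity it plots only ten of the Hudson Auto Repair parts costs (the second row of the sample) rather than all 50; the function name `text_dot_plot` is hypothetical.

```python
from collections import Counter

def text_dot_plot(values):
    """Stack one dot per observation above a horizontal axis of integer values."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    span = hi - lo + 1
    rows = []
    # Build the rows from the top of the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("." if counts[v] >= level else " " for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * span)  # horizontal axis covering the range of the data
    padding = " " * (span - len(str(lo)) - len(str(hi)))
    rows.append(f"{lo}{padding}{hi}")  # axis end labels
    return "\n".join(rows)

# Ten parts costs ($) taken from the Hudson Auto Repair sample.
sample = [71, 69, 72, 89, 66, 75, 79, 75, 72, 76]
plot = text_dot_plot(sample)
print(plot)
```

Repeated values (here 72 and 75) appear as stacked dots, which is how a dot plot shows where the data cluster.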
2.2.3) Dot Plot
Tune-up Parts Cost
(Figure: dot plot; horizontal axis shows Cost ($), from 50 to 110.)
2.2.4) Histogram
A histogram is a common graphical presentation of quantitative data.
To construct a histogram, place the variable of interest on the horizontal axis.
Then draw a rectangle above each class interval, with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
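The construction can be sketched in text form for the Hudson Auto Repair frequency distribution. Columns are drawn edge to edge to mirror the lack of separation between adjacent classes; the function name `text_histogram` and the column width are illustrative choices.

```python
# Classes and frequencies from the Hudson Auto Repair frequency distribution.
labels = ["50-59", "60-69", "70-79", "80-89", "90-99", "100-109"]
frequency = [2, 13, 16, 7, 7, 5]

def text_histogram(class_labels, freqs, col_width=8):
    """Draw adjacent columns whose heights equal the class frequencies."""
    rows = []
    for level in range(max(freqs), 0, -1):
        # A class contributes a filled cell when its frequency reaches this level.
        cells = "".join(
            "#" * col_width if f >= level else " " * col_width for f in freqs
        )
        rows.append(f"{level:>2} |{cells.rstrip()}")
    rows.append("   +" + "-" * (col_width * len(freqs)))
    axis_labels = "".join(f"{label:<{col_width}}" for label in class_labels)
    rows.append("    " + axis_labels.rstrip())
    return "\n".join(rows)

hist = text_histogram(labels, frequency)
print(hist)
```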
2.2.4) Histogram
Tune-up Parts Cost
(Figure: histogram; horizontal axis shows Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110; vertical axis shows Frequency, scaled from 2 to 18.)
Shapes of Histogram
Depending on the data set, a histogram used to summarize the data may take different shapes.
Various Shapes of Histogram
1. Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people
(Figure: symmetric histogram; vertical axis shows Relative Frequency, from 0 to .35.)
Various Shapes of Histogram
2. Moderately Skewed Left
• A longer tail to the left
• Example: exam scores
(Figure: moderately left-skewed histogram; vertical axis shows Relative Frequency, from 0 to .35.)
Various Shapes of Histogram
3. Moderately Skewed Right
• A longer tail to the right
• Example: housing values
(Figure: moderately right-skewed histogram; vertical axis shows Relative Frequency, from 0 to .35.)
Various Shapes of Histogram
4. Highly Skewed Right
• A very long tail to the right
• Example: executive salaries
(Figure: highly right-skewed histogram; vertical axis shows Relative Frequency, from 0 to .35.)
2.2.5) Cumulative Distributions
1. Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
2. Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
3. Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.
Example
Recall the Hudson Auto Repair Data:
Parts Cost ($)    Frequency
50-59             2
60-69             13
70-79             16
80-89             7
90-99             7
100-109           5
Total             50
2.2.5) Cumulative Distributions
Hudson Auto Repair
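Cost ($)    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percent Frequency
≤ 59        2                       .04                              4
≤ 69        15                      .30                              30
≤ 79        31                      .62                              62
≤ 89        38                      .76                              76
≤ 99        45                      .90                              90
≤ 109       50                      1.00                             100
Example calculations for the ≤ 69 row: cumulative frequency = 2 + 13 = 15; cumulative relative frequency = 15/50 = .30; cumulative percent frequency = .30(100) = 30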
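All three cumulative distributions are running totals of the class frequencies. A minimal sketch using the standard-library `itertools.accumulate` is shown below; the list names are illustrative.

```python
from itertools import accumulate

# Upper class limits and frequencies from the Hudson Auto Repair distribution.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]
n = sum(frequency)  # 50 tune-ups

# Running total of frequencies: items with values <= each upper class limit.
cum_freq = list(accumulate(frequency))

# Divide by n for proportions; scale by 100 for percentages.
cum_rel = [cf / n for cf in cum_freq]
cum_pct = [100 * cf / n for cf in cum_freq]

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:<5}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```

The last entry of each cumulative distribution always equals the total: 50 items, a proportion of 1.00, or 100 percent.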
2.2.6) Ogive
An ogive is a graph of a cumulative distribution.
The vertical axis shows one of the following:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
The plotted points are connected by straight lines.
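The points of a cumulative percent ogive for the Hudson Auto Repair data can be sketched as below. Placing each point half a unit above the upper class limit (for example 89.5 for the 80-89 class) matches the labeled point (89.5, 76) in the example figure, but the starting point (49.5, 0) and the helper `ogive_value` are assumptions made for illustration. Reading values between points assumes costs are spread evenly within a class.

```python
# Upper class limits and cumulative percent frequencies (Hudson Auto Repair).
upper_limits = [59, 69, 79, 89, 99, 109]
cum_pct = [4, 30, 62, 76, 90, 100]

# Assumed convention: plot each point halfway between adjacent class limits,
# and start the curve at 0 percent just below the first class.
points = [(49.5, 0)] + [(u + 0.5, p) for u, p in zip(upper_limits, cum_pct)]

def ogive_value(x, pts):
    """Read the ogive at x by following the straight line between plotted points."""
    if x <= pts[0][0]:
        return pts[0][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return pts[-1][1]

for x, y in points:
    print(f"({x}, {y})")
print(f"Estimated percent of costs at or below $64.50: {ogive_value(64.5, points)}")
```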
Example: Ogive with Cumulative Percent Frequencies
Tune-up Parts Cost
(Figure: ogive; horizontal axis shows Parts Cost ($), from 50 to 110; vertical axis shows Cumulative Percent Frequency, from 20 to 100; labeled point (89.5, 76).)