1.1 Displaying and Describing Categorical & Quantitative Data
You should be able to:
• Recognize when a variable is categorical or quantitative
• Choose an appropriate display for a categorical variable and a quantitative variable
• Summarize the distribution with a bar graph, pie chart, stem-and-leaf plot, histogram, dot plot, or box plot
• Describe the distribution of a quantitative variable in terms of its shape, center, spread, and outliers
Types of Graphs
• Bar graph
• Histogram
• Dot plot
• Stem-and-leaf plot
• Box plot
Which graph to use?
• Depends on the type of data
– For categorical data, you will typically use either a bar graph or a pie chart
– For quantitative data, you can use a dot plot, stem-and-leaf plot, histogram, or box plot
Categorical Data
• The objects being studied are grouped into categories based on some qualitative trait.
• You can use either frequency (count) or relative frequency (percentage) to express the data.
• Ex- eye color, type of car you drive, gender, etc.
Categorical Data (Single Variable)
Eye Color            BLUE          BROWN         GREEN
Frequency (counts)   20            50            5
Relative Frequency   20/75 = .27   50/75 = .67   5/75 = .07
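The relative frequencies in the eye-color table can be checked with a few lines of Python. This is only a sketch of the arithmetic (count divided by total); the variable names are illustrative.

```python
# Counts from the eye-color table.
counts = {"Blue": 20, "Brown": 50, "Green": 5}

total = sum(counts.values())  # 75 people in all

# Relative frequency = category count / total count.
relative = {color: count / total for color, count in counts.items()}

for color, count in counts.items():
    # Two decimal places, matching the precision used in the table.
    print(f"{color:<6} {count:>3} {relative[color]:.2f}")
```

Rounded to two places, the three proportions are .27, .67, and .07. Before rounding they add up to exactly 1.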
Pie Chart (Data is Counts or Percentages)
[Pie chart: Eye Color. Blue, 20, 27%; Brown, 50, 67%; Green, 5, 7%]
Bar Graph (Shows distribution of data)
[Bar graph: Eye Color. Horizontal axis: Color (Blue, Brown, Green). Vertical axis: Frequency, scaled 0 to 60.]
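A rough text version of the eye-color bar graph can be produced as follows. It is a sketch only: the bars run horizontally rather than vertically, and the helper name `bar_rows` and the scale of one `#` per 5 counts are assumptions made for illustration.

```python
# Counts from the eye-color example.
counts = {"Blue": 20, "Brown": 50, "Green": 5}

def bar_rows(freq, per_mark=5):
    """Return one text bar per category; each '#' stands for `per_mark` counts."""
    return [
        f"{name:<6}| {'#' * round(n / per_mark)} {n}"
        for name, n in freq.items()
    ]

for row in bar_rows(counts):
    print(row)
```

Each bar's length is proportional to its category's frequency, which is what lets a reader compare categories at a glance.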
Quantitative Variables
• Variables that are numerical. They represent a measurable quantity.
• Ex- person’s height, number of hamburgers sold each day of the week, speed of a car, pulse rate, etc.
Dot Plot
• Summarizes quantitative data.
• Horizontal axis represents the measurement scale.
• Plot one dot for each data point.
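The three steps above can be sketched as a small text dot plot. The data values here are invented purely for illustration, the function name `dot_plot` is hypothetical, and the sketch assumes whole-number data so that each value gets its own column on the axis.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration only.
values = [3, 5, 4, 5, 2, 5, 4, 3, 4, 4]

def dot_plot(data):
    """Return the lines of a text dot plot: one '*' per data point."""
    counts = Counter(data)
    scale = range(min(data), max(data) + 1)  # the measurement scale
    tallest = max(counts.values())
    rows = []
    # Build from the top row down so the marks stack upward from the axis.
    for level in range(tallest, 0, -1):
        row = "".join(" * " if counts[v] >= level else "   " for v in scale)
        rows.append(row.rstrip())
    # The last line is the horizontal axis with one label per value.
    rows.append("".join(f"{v:^3}" for v in scale).rstrip())
    return rows

print("\n".join(dot_plot(values)))
```

The number of marks in the plot always equals the number of data points, so no information is lost by drawing it.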
Dot Plot
[Dot plot: Fastest Ever Driving Speed. Horizontal axis: Speed, 70 to 160. Groups: Women (126), Men (100). 226 Stat 100 Students, Fall '98.]
Stem-and-Leaf Plot
• Summarizes quantitative data.
• Each data point is broken down into a “stem” and a “leaf.”
• First, “stems” are aligned in a column.
• Then, “leaves” are attached to the stems.
Here are the scores from two periods of math class. Students took the same test.
Period 1: 76 79 85 58 97 94 82 81 75 63 60 92 75 98 83 58 72 57 70 81
Period 2: 57 60 88 85 79 70 65 98 97 59 58 65 62 77 77 75 73 69 82 81
Period 1: 76 79 85 58 97 94 82 81 75 63 60 92 75 98 83 58 72 57 70 81
Notice that the data (numerical facts) are numbers between 57 and 98. Create the stems by listing the numbers from 5 to 9.
Stem | Leaf
5 |
6 |
7 |
8 |
9 |
Match up the data to the stem-and-leaf plot. The last digit in 76 goes with the stem 7. Then the last digit in 79 goes with the stem 7. Then the last digit in 85 goes with the stem 8, and this pattern continues until all data have been recorded in the stem-and-leaf plot.
Stem | Leaf
5 | 8 8 7
6 | 3 0
7 | 6 9 5 5 2 0
8 | 5 2 1 3 1
9 | 7 4 2 8
Rearrange the leaves in numerical order from least to greatest.
Stem | Leaf
5 | 7 8 8
6 | 0 3
7 | 0 2 5 5 6 9
8 | 1 1 2 3 5
9 | 2 4 7 8
Key: 7 | 9 means 79
A key should be included when making a stem-and-leaf plot.
Period 2: 57 60 88 85 79 70 65 98 97 59 58 65 62 77 77 75 73 69 82 81
Stem | Leaf
5 | 7 8 9
6 | 0 2 5 5 9
7 | 0 3 5 7 7 9
8 | 1 2 5 8
9 | 7 8
Key: 7 | 9 means 79
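The construction for both periods can be sketched in Python as a check on the hand-built plots. This is a minimal illustration that assumes two-digit whole-number scores. The function names are hypothetical, and the Period 1 list starts with 76, as in the worked example.

```python
from collections import defaultdict

period_1 = [76, 79, 85, 58, 97, 94, 82, 81, 75, 63,
            60, 92, 75, 98, 83, 58, 72, 57, 70, 81]
period_2 = [57, 60, 88, 85, 79, 70, 65, 98, 97, 59,
            58, 65, 62, 77, 77, 75, 73, 69, 82, 81]

def stem_and_leaf(scores):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    plot = defaultdict(list)
    for score in sorted(scores):      # sorting first keeps leaves in order
        stem, leaf = divmod(score, 10)
        plot[stem].append(leaf)
    # List every stem from lowest to highest, even one with no leaves.
    return {stem: plot[stem] for stem in range(min(plot), max(plot) + 1)}

def show(scores):
    """Print the plot with a key, one stem per line."""
    for stem, leaves in stem_and_leaf(scores).items():
        print(f"{stem} | {' '.join(map(str, leaves))}")
    print("Key: 7 | 9 means 79")

show(period_1)
show(period_2)
```

Sorting the scores before splitting them means the "rearrange the leaves" step happens automatically.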
Histogram
• Divide the measurement scale into equal-sized categories (the bin width).
• Determine the number (or percentage) of measurements falling into each category.
• Draw a bar for each category so that the bars’ heights represent the number (or percent) falling into each category.
• Label and title appropriately.
Use common sense in determining the number of categories to use. Between 5 and 15 intervals is preferable.
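The binning steps can be sketched as follows, reusing the Period 1 test scores from the stem-and-leaf example as data. The function name `histogram_counts` and its parameters are assumptions for illustration. The sketch also assumes whole-number data and a `start` value at or below the smallest data point.

```python
period_1 = [76, 79, 85, 58, 97, 94, 82, 81, 75, 63,
            60, 92, 75, 98, 83, 58, 72, 57, 70, 81]

def histogram_counts(data, start, width):
    """Count the values in each equal-width bin [low, low + width)."""
    n_bins = (max(data) - start) // width + 1
    counts = [0] * n_bins
    for x in data:
        counts[(x - start) // width] += 1   # index of the bin holding x
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Bin width 10 gives 5 intervals; bin width 5 would give 9.
for low, high, count in histogram_counts(period_1, start=50, width=10):
    print(f"{low}-{high - 1}: {'#' * count} ({count})")
```

Changing `width` shows how the bin width controls the number of bars: a wider bin means fewer, taller bars, and a narrower bin means more, shorter ones.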
Histogram
[Histogram, too few categories: Age of Spring 1998 Stat 250 Students, n=92 students. Horizontal axis: Age (in years), ticks at 18, 23, 28. Vertical axis: Frequency (Count), 0 to 60.]
[Histogram, too many categories: GPAs of Spring 1998 Stat 250 Students, n=92 students. Horizontal axis: GPA, ticks at 2, 3, 4. Vertical axis: Frequency (Count), 0 to 7.]
Histogram
[Histogram: Age of Spring 1998 Stat 250 Students, n=92 students. Horizontal axis: Age (in years), 18 to 27. Vertical axis: Frequency (Count), 0 to 50.]
Strengths and Weaknesses of Graphs for Quantitative Data
• Histograms
– Use intervals
– Good for judging the “shape” of the data
– Not good for small data sets
• Stem-and-Leaf Plots
– Good for sorting data (finding the median)
– Not good for large data sets
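Because a stem-and-leaf plot lists the data in order, the median can be read off by counting in to the middle. The sketch below does the same counting on the two periods of test scores from the stem-and-leaf example and compares the result with Python's standard `statistics.median`. The variable names are illustrative.

```python
from statistics import median

period_1 = [76, 79, 85, 58, 97, 94, 82, 81, 75, 63,
            60, 92, 75, 98, 83, 58, 72, 57, 70, 81]
period_2 = [57, 60, 88, 85, 79, 70, 65, 98, 97, 59,
            58, 65, 62, 77, 77, 75, 73, 69, 82, 81]

ordered = sorted(period_1)       # the order the sorted leaves give you
middle = len(ordered) // 2       # 20 scores, so the middle pair is 10th and 11th

# With an even number of scores, average the two middle values.
by_hand = (ordered[middle - 1] + ordered[middle]) / 2

print(by_hand, median(period_1), median(period_2))
```

For Period 1 the two middle scores are 76 and 79, giving a median of 77.5. For Period 2 they are 73 and 75, giving 74.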
Strengths and Weaknesses of Graphs for Quantitative Data
• Dotplots
– Use individual data points
– Good for showing general descriptions of center and variation
– Not good for judging shape for large data sets
Summary
• Many possible types of graphs.
• Use common sense in reading graphs.
• When creating graphs, don’t summarize your data too much or too little.
• When creating graphs, label everything for others. Remember you are trying to communicate something to others!