chapter 1 – exploring data yms - 1.1 displaying distributions with graphs xii-7
TRANSCRIPT
Chapter 1 – Exploring Data
YMS - 1.1
Displaying Distributions with Graphs
xii-7
Consider This…
Data beat anecdotes - seatbelts
Lurking variables – Simpson’s Paradox
Origin of data – Ann Landers
Variation
3 W’s – Who, What, Why?
Vocabulary
Data – Numbers with a context
Individuals – Objects described by a set of data
Variable – Characteristic of an individual
Categorical Variable – places an individual into one of several groups or categories
Quantitative Variable – takes values for which arithmetic operations make sense
p7 #1.1 – 1.4
Case – Row of data (all variables of an individual)
Distribution – Pattern of variation of a variable; what values the variable takes and how often
Exploratory Data Analysis – Statistical tools and ideas that help you examine data in order to describe their main features
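The definition of a distribution (what values a variable takes and how often) can be sketched for a categorical variable as a table of counts and percents. The data and the helper name `distribution` below are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical data: the class year recorded for eight students.
class_years = ["junior", "senior", "junior", "sophomore",
               "senior", "junior", "senior", "junior"]

def distribution(values):
    """Return {value: (count, percent)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, 100 * count / total)
            for value, count in counts.items()}

year_distribution = distribution(class_years)
for value, (count, percent) in sorted(year_distribution.items()):
    print(f"{value:<10} count={count}  percent={percent:.1f}%")
```

The percents across all categories should add up to 100, which is a quick check that no individual was counted twice or left out.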
Count
Percent
Outlier
Overall Pattern of a Distribution (SOCS)
Shape. Outlier. Center. Spread.
Write 2-3 sentences in context with appropriate measures.
8-17
Types of Graphs
1. Bar Graph
Leave a space between the bars
Label the category names at equally spaced intervals beneath the horizontal axis
2. Pie Chart
Must add up to 100%
Let the computer create it
3. Dotplot
Mark a dot above the number on the horizontal axis corresponding to each data value
4. Stemplot
Stems and leaves are arranged in increasing order
Include legend
Split stems if necessary (0-4 and 5-9)
Round or truncate when necessary
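The stemplot rules above can be sketched as a small text-based plot. This is a minimal sketch, assuming non-negative whole-number data where the leaf is the final digit; the function name `stemplot`, the `split` flag, and the scores are hypothetical.

```python
from collections import defaultdict

def stemplot(values, split=False):
    """Return the lines of a text stemplot for non-negative numbers.

    The stem is every digit but the last; the leaf is the last digit.
    With split=True each stem appears twice: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):                 # increasing order
        stem, leaf = divmod(int(v), 10)      # int() truncates decimals
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)
    halves = (0, 1) if split else (0,)
    lowest = min(stem for stem, _ in rows)
    highest = max(stem for stem, _ in rows)
    lines = []
    for stem in range(lowest, highest + 1):  # keep empty stems in place
        for half in halves:
            leaves = "".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem:>2} | {leaves}")
    lines.append("Key: 1 | 5 means 15")      # the legend
    return lines

# Hypothetical quiz scores.
scores = [12, 15, 21, 23, 23, 28, 34, 35, 41, 9]
print("\n".join(stemplot(scores)))
print("\n".join(stemplot(scores, split=True)))
```

Empty stems are printed on purpose: dropping them would hide gaps in the distribution.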
p15 Technology Toolbox
Greed
The Game of Greed
Everyone stands.
A pair of dice will be thrown by a classmate. After each toss you have the option to sit and keep the score (the total on the dice) or stand and continue onto the next round.
The game is over when everyone has decided to sit OR when a two is thrown (not snake eyes - just the number 2). If you're standing when a 2 is thrown, your score for the round is zero.
A game consists of 5 rounds. At the end of the game, add your 5 scores to get your total.
HW: p16 #1.8 & 1.9
Activity: Legendary WS
More Vocab
Symmetric – If the right and left sides of a distribution are mirror images of each other
Right/Left Skewed – Values are stretched to the right/left
Percentile – The value such that p percent of the observations fall at or below it
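The percentile definition can be checked numerically. The sketch below takes the p-th percentile to be the smallest observation with at least p percent of the data at or below it; that is one common convention and an assumption here, since software often interpolates between observations instead. The scores and function names are hypothetical.

```python
import math

def percent_at_or_below(data, value):
    """Percent of observations that are less than or equal to value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)

def percentile(data, p):
    """Smallest observation with at least p percent of the data at or below it."""
    ordered = sorted(data)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical test scores for ten students.
scores = [55, 62, 68, 70, 74, 78, 81, 85, 90, 96]
print(percentile(scores, 50))           # 50th percentile under this convention
print(percent_at_or_below(scores, 74))  # percent of scores at or below 74
```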
18-34
5. Time plots
Plots each observation against the time at which it is measured
Trend - a long-term upward or downward movement over time
Seasonal Variation - a pattern that repeats itself at regular time intervals
More Graphs
6. Histogram
Graphs the distribution of one quantitative variable
Precise intervals
Intervals must be kept at same width
Can use percentages instead of counts
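The histogram rules (precise intervals of equal width, with counts or percentages) can be sketched as a frequency table. This is an illustration under assumed choices: classes are half-open intervals that include the left endpoint, and the scores, class start, and width are hypothetical.

```python
def histogram_table(values, start, width, n_classes):
    """Count observations in n_classes equal-width classes [low, high).

    Returns a list of (low, high, count, percent) rows.
    """
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        counts[index] += 1
    total = len(values)
    return [(start + i * width, start + (i + 1) * width,
             count, 100 * count / total)
            for i, count in enumerate(counts)]

# Hypothetical exam scores for twenty students.
exam_scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78,
               79, 80, 82, 84, 85, 88, 91, 93, 95, 99]
table = histogram_table(exam_scores, start=50, width=10, n_classes=5)
for low, high, count, percent in table:
    print(f"{low:>3} to <{high:<3} {'*' * count:<8} {count:>2}  {percent:.0f}%")
```

Because every class has the same width, the bar heights can be compared directly whether they show counts or percents.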
p22 #1.12 – calculator – zoom stat
p27 #1.16 – reading a histogram
7. Ogive ("O-Jive") aka Relative Cumulative Frequency Graph
Make table with class, frequency, relative frequency, cumulative frequency, and relative cumulative frequency
Plot a point corresponding to the relative cumulative frequency in each class interval at the left endpoint of the next class interval
p31 #1.19
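The table and plotting rule for the ogive can be sketched as follows. The classes and frequencies are hypothetical, and the extra starting point at height 0 on the left end of the first class is an assumption about how the graph is usually anchored rather than something stated on the slide.

```python
from itertools import accumulate

def ogive_table(class_bounds, frequencies):
    """Build the five-column table: class, frequency, relative frequency,
    cumulative frequency, and relative cumulative frequency."""
    total = sum(frequencies)
    cumulative = accumulate(frequencies)
    return [{"class": bounds,
             "freq": freq,
             "rel_freq": freq / total,
             "cum_freq": cum,
             "rel_cum_freq": cum / total}
            for bounds, freq, cum in zip(class_bounds, frequencies, cumulative)]

def ogive_points(rows):
    """Points to plot: each class's relative cumulative frequency goes at
    its upper bound, which is the left endpoint of the next class."""
    points = [(rows[0]["class"][0], 0.0)]   # assumed starting point
    points += [(row["class"][1], row["rel_cum_freq"]) for row in rows]
    return points

# Hypothetical classes and frequencies.
bounds = [(50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
freqs = [2, 3, 6, 5, 4]
rows = ogive_table(bounds, freqs)
points = ogive_points(rows)
for x, y in points:
    print(f"x={x:>3}  relative cumulative frequency={y:.2f}")
```

The last point always has height 1 (100%), because every observation is at or below the top of the final class.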
HW: #1.20, 1.26 & 1.28
Meet in the lab 557 tomorrow.
YMS - 1.2
Describing Distributions with Numbers
Measures of Center
Mean
Add all values and divide by the number of observations
Not a resistant measure of center
Median
Midpoint of a distribution; 50th percentile
All values must be arranged in increasing order before finding median
Median is a resistant measure
Mean vs. Median
When to use
In skewed distributions
#1.34-1.35 on p41
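What "resistant" means can be seen by changing a single observation and recomputing both measures of center. The salaries below are hypothetical (in thousands of dollars); `statistics.mean` and `statistics.median` are from the Python standard library.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars.
salaries = [30, 32, 35, 38, 40]
# Same data, but the largest salary is replaced by an extreme value.
salaries_with_outlier = [30, 32, 35, 38, 400]

print(mean(salaries), median(salaries))
print(mean(salaries_with_outlier), median(salaries_with_outlier))
```

The one extreme value drags the mean far above most of the data while the median does not move, which is why the median is the usual choice for skewed distributions.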
37-47
Range – Difference between largest and smallest value of a distribution
Quartiles – 25th and 75th percentiles
Interquartile Range – The distance between the first and third quartiles
Modified Boxplots
Shows the outliers
Always use this one!
Boxplots and Vocab
Outliers
1.5 x IQR Rule
1 3 3 5 7 10 11 11 11 15 25
#1.36 on p47
#1.39 on p48
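The 1.5 x IQR rule can be worked through on the eleven values listed above. This sketch assumes the quartiles are found as the medians of the lower and upper halves, leaving out the overall median when the number of observations is odd; calculators and software sometimes use other quartile conventions. The function names are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle observation when n is odd.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

def outliers_by_iqr_rule(data):
    """Observations more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

data = [1, 3, 3, 5, 7, 10, 11, 11, 11, 15, 25]
print(five_number_summary(data))   # (1, 3, 10, 11, 25)
print(outliers_by_iqr_rule(data))  # [25]
```

Here Q1 = 3 and Q3 = 11, so IQR = 8 and the fences sit at 3 - 12 = -9 and 11 + 12 = 23. Only 25 lies beyond a fence, so a modified boxplot would draw the upper whisker to 15 and mark 25 as a separate point.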
Measures of Spread
Standard Deviation
How far are the observations from their mean
The larger the standard deviation, the wider the distribution
Is the square root of the variance
Is not a resistant measure
Variance
Average of the square of the deviations of the observations from their mean
Has a different unit of measurement than standard deviation
#1.40 and #1.43 on p52
49-53
Degrees of Freedom = n - 1
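The variance, the standard deviation, and the role of n - 1 can be sketched in a few lines. The observations are hypothetical; the hand-written function divides the sum of squared deviations by the degrees of freedom, and the standard-library `statistics.variance` is used as a cross-check because it also divides by n - 1.

```python
import math
import statistics

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

# Hypothetical observations with mean 5.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
variance = sample_variance(observations)   # 32 / 7
std_dev = math.sqrt(variance)              # square root of the variance

print(f"variance = {variance:.3f}, standard deviation = {std_dev:.3f}")
print(math.isclose(variance, statistics.variance(observations)))
```

The variance is in squared units of the data, while the standard deviation is back in the original units, which is why the two have different units of measurement.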
What measures to use
Mean and Standard Deviation
Reasonably symmetric distributions that are free of outliers
5-number summary
Skewed distributions or ones with strong outliers
Would you rather have a 10% raise or a $1000 raise?
Effect of a Linear Transformation
x_new = a + bx
Fathom First Day
Multiplying by constant b
Multiplies both measures of center and spread by constant b.
Adding the same number a
Adds a to measures of center and to quartiles
Does not change measures of spread
Transformations do not change the shape of a distribution
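The two transformation rules can be checked numerically with the raise question from the earlier slide: a $1000 raise is x_new = 1000 + x (a = 1000, b = 1) and a 10% raise is x_new = 1.10x (a = 0, b = 1.10). The salaries and the helper name `transform` are hypothetical.

```python
from statistics import mean, median, stdev

def transform(data, a, b):
    """Apply the linear transformation x_new = a + b * x to every value."""
    return [a + b * x for x in data]

# Hypothetical annual salaries in dollars.
salaries = [30000, 32000, 35000, 38000, 40000]
flat_raise = transform(salaries, a=1000, b=1)       # $1000 raise
percent_raise = transform(salaries, a=0, b=1.10)    # 10% raise

for label, data in [("original", salaries),
                    ("+$1000", flat_raise),
                    ("+10%", percent_raise)]:
    print(f"{label:<9} mean={mean(data):.0f}  "
          f"median={median(data):.0f}  sd={stdev(data):.0f}")
```

Adding $1000 shifts the mean and median by 1000 and leaves the standard deviation alone; multiplying by 1.10 scales all three by 1.10.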
53-66
Use back to back stemplots or boxplots
Easy to do in Fathom!
Example 1.17 on p57
Comparing Distributions