GAISEing into the Common Core Standards – Day 1

A Professional Development Seminar sponsored by the Ann Arbor Chapter of the ASA

The Core

Summarize and describe distributions. Summarize, represent, and interpret data on a single count or measurement variable.

CCSS.Math.Content.6.SP.B.4 Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

CCSS.Math.Content.HSS-ID.A.1 Represent data with plots on the real number line (dot plots, histograms, and box plots).

CCSS.Math.Content.6.SP.B.5 Summarize numerical data sets in relation to their context, such as by:
• CCSS.Math.Content.6.SP.B.5a Reporting the number of observations.
• CCSS.Math.Content.6.SP.B.5b Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.
• CCSS.Math.Content.6.SP.B.5c Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.

CCSS.Math.Content.HSS-ID.A.2 Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.

CCSS.Math.Content.6.SP.B.5d Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.

CCSS.Math.Content.HSS-ID.A.3 Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).

http://www.corestandards.org/Math/Content/6/SP/B/4

http://www.corestandards.org/Math/Content/HSS/ID/A/1


http://www.corestandards.org/Math/Content/6/SP/B/5/a

http://www.corestandards.org/Math/Content/6/SP/B/5/b

http://www.corestandards.org/Math/Content/6/SP/B/5/c


http://www.corestandards.org/Math/Content/6/SP/B/5/d


Getting to Know You … Let’s Collect Some Data!

• Go around the room and enter data for yourself on the charts.

• Men: blue markers
• Women: red markers

Let’s think about our data

• What are the different types of variables that we measured?

• How did you measure each of the variables?
• Were any of these hard to measure?
• What were the units for each variable?
• What might the context be for each of these charts?

Quantitative vs Categorical data

• Height (in)
• Number of letters in your first name
• Number of siblings (not including yourself)
• Favorite color
• Do you currently have a dog?
• How many pets do you currently have?
• Travel time to this workshop? (min)
• How many years have you been teaching?

Quantitative vs Categorical data

Applet: http://mathnstats.com/applets/Categorical-Quantitative.html


Quantitative vs Categorical data

• Common Misconceptions:
– Histograms vs bar charts: don’t discuss shape for bar charts!
– Zip code?

Common Shapes of Distributions

Shape

• Skewed right (positive) / left (negative)

Matching Shapes and Characteristics

[Figure: four histograms labeled Distribution 1 through Distribution 4, each with a blank "Characteristic =" to fill in]

Characteristics:

1. Distribution of age for the population of the United States in the year 1980. Describe and explain the shape of the distribution.
2. Distribution of miles of coastline for the 50 United States. Describe and explain the shape of the distribution. Which states do you think would be in the last class furthest to the right?
3. Distribution of the number of miles traveled to work, that is, commuting distance for employed adults in a city. Describe and explain the shape of the distribution.
4. Distribution of age at death for the population of the United States (year 1980). Describe and explain the shape of the distribution.

Measures of Center

• What is a typical value in a given situation?
– Tallest bar: mode
– Middle value: median

• Median: differs for odd and even sample sizes
– Show it on your hand!

Measures of Center

• Mean:
– Add and divide
– “fair share”
– Pencil activity
– Block activity
– Glass/beaker activity
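For instructors who want a quick computational check alongside the hands-on activities, the two measures can be sketched in a few lines of Python. The numbers below are made up for illustration and are not data from the workshop charts.

```python
from statistics import mean, median

# Made-up illustrative data (not from the workshop): pencils held by three students
pencils = [2, 4, 9]

# Mean as "fair share": pool everything, then split it equally (add and divide)
fair_share = sum(pencils) / len(pencils)
assert fair_share == mean(pencils)

# Odd number of values: the median is the single middle value
odd_median = median([2, 4, 9])

# Even number of values: the median is the average of the two middle values
even_median = median([2, 4, 6, 9])

print(fair_share, odd_median, even_median)
```

Note that the fair share (5) is not one of the three original values, which connects to the point that the mean need not be a value in the data set.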

Measures of Center

Applets for comparing medians and means:

http://onlinestatbook.com/stat_sim/descriptive/index.html

http://www.stat.tamu.edu/~west/ph/

http://bcs.whfreeman.com/ips4e/cat_010/applets/meanmedian.html


Measures of Center: Misconceptions

• When is the mean not a good measure of center?

• The mean doesn’t have to be a value in the data set.
– The mean number of children per household is 2.5 children!

Why Do We Need Measures of Spread?

Midterms are returned and the “average” was reported as 76 out of 100.

You received a score of 88.

How should you feel?

Measures of Spread

• Look at the data: discuss spread.
• Range = Max – Min = Spread of 100% of data
• Interquartile Range = IQR = Q3 – Q1 = Spread of middle 50% of data
– Needed for boxplots: it is the length of the box

Measures of Spread

• Mean Absolute Deviation
– MAD
– Average distance of values from the mean

• Standard Deviation
– Interpretation is similar to MAD

Increasing Spread

Consider the following three data sets.

I: 20 20 20
II: 18 20 22
III: 17 20 23

(a) Which data set will have the smallest standard deviation?

(b) Which data set will have the largest standard deviation?

(c) Find the standard deviation for each data set and check your answers to (a) and (b).

Increasing Spread ~ Instructor Side
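One way to check part (c) is a short Python sketch like the one below. It assumes the sample standard deviation (dividing by n − 1), since the slides do not say which version is intended. The helper `mean_absolute_deviation` is a name introduced here for illustration.

```python
from statistics import mean, stdev

def mean_absolute_deviation(values):
    """Average distance of the values from their mean (MAD)."""
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)

# The three data sets from the exercise
data_sets = {
    "I": [20, 20, 20],
    "II": [18, 20, 22],
    "III": [17, 20, 23],
}

# Assumption: sample standard deviation (divide by n - 1), via statistics.stdev
spread = {name: (mean_absolute_deviation(v), stdev(v)) for name, v in data_sets.items()}

for name, (mad, sd) in spread.items():
    print(f"{name}: MAD = {mad:.2f}, SD = {sd:.2f}")
```

Under that assumption the standard deviations come out to 0, 2, and 3, so data set I has the smallest spread and data set III the largest, even though all three share the same mean of 20.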

Different Graphs, Same Data

Bin Sizes in Histograms

Applet: http://www.stat.sc.edu/~west/javahtml/Histogram.html


Boxplots and Symmetry

Back to the Core

• CCSS.Math.Content.HSS-ID.A.4 Use mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve.


Normally Distributed Data

• Bell/Mound Shape
– Symmetric
– Mean ~ median

• Z-scores
– z = (observed value - mean) / standard deviation
– Empirical Rule as Frame of Reference
– Take them to calculator or table to get probability

Empirical Rule

For bell-shaped histograms, approximately …
• 68% of values fall within 1 standard deviation of mean in either direction.
• 95% of values fall within 2 standard deviations of mean in either direction.
• 99.7% of values fall within 3 standard deviations of mean in either direction.

A very useful frame of reference!
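The three percentages can be checked against an exact normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module. The helper name `share_within` is introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The exact values are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule is stated with "approximately".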

Exam Scores

Scores on final exam have approximately a bell-shaped distribution with a mean score of 70 points and a standard deviation of 10 points.

Sketch a picture…

Exam Scores

Scores on final exam have approximately a bell-shaped distribution with a mean score of 70 points and a standard deviation of 10 points.

Suppose you scored 80 points on the exam.

How many standard deviations from the mean is your score?

Standard Score or z-score

$$ z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}} $$

Empirical Rule (in terms of z-scores)

For bell-shaped curves, approximately …
• 68% of the values have z-scores between –1 and 1.
• 95% of the values have z-scores between –2 and 2.
• 99.7% of the values have z-scores between –3 and 3.

Exam Scores

Scores on final exam have approximately a bell-shaped distribution with a mean score of 70 points and a standard deviation of 10 points.

Suppose Rob’s score was 2 standard deviations above the mean.

What was Rob’s score?

What can you say about the proportion of students who scored higher than Rob?
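The exam-score questions can be worked through with a small sketch. The function names `z_score` and `score_from_z` are introduced here for illustration. The last line assumes the scores follow an exact normal curve, which the slide only describes as approximately bell-shaped.

```python
from statistics import NormalDist

MEAN_SCORE = 70   # mean from the slide
SD_SCORE = 10     # standard deviation from the slide

def z_score(observed, mean=MEAN_SCORE, sd=SD_SCORE):
    """Number of standard deviations the observed value lies from the mean."""
    return (observed - mean) / sd

def score_from_z(z, mean=MEAN_SCORE, sd=SD_SCORE):
    """Invert the z-score: the raw score that sits z standard deviations from the mean."""
    return mean + z * sd

your_z = z_score(80)            # a score of 80 is 1 standard deviation above the mean
robs_score = score_from_z(2)    # 2 standard deviations above the mean is 90 points

# Empirical Rule: about 95% within 2 SD, so about 5% outside, half of that above
empirical_above_rob = (1 - 0.95) / 2

# Assumption: an exact normal model, for comparison with the rule of thumb
normal_above_rob = 1 - NormalDist(MEAN_SCORE, SD_SCORE).cdf(robs_score)

print(your_z, robs_score, round(empirical_above_rob, 3), round(normal_above_rob, 4))
```

The Empirical Rule gives about 2.5% of students scoring higher than Rob, and the normal model gives a slightly smaller figure, which is a useful reminder that the rule is a frame of reference rather than an exact calculation.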

Check for Nonnormal Features

• Are these normal?

• Why/Why not?

Comparing Distributions

Comparing Distributions ~ Instructor Side

Are you a Good Timer?

• Quick Experiment:
– Close your eyes
– When you hear “START”, begin counting off seconds in your head
– When you hear “STOP”, write down the number you reached

Are you a Good Timer?

• Come up and graph the results
• What do we see?

• Keep your result – we will revisit it later…

Back to the Core

Draw informal comparative inferences about two populations.

• CCSS.Math.Content.7.SP.B.3 Informally assess the degree of visual overlap of two numerical data distributions with similar variabilities, measuring the difference between the centers by expressing it as a multiple of a measure of variability. For example, the mean height of players on the basketball team is 10 cm greater than the mean height of players on the soccer team, about twice the variability (mean absolute deviation) on either team; on a dot plot, the separation between the two distributions of heights is noticeable.

• CCSS.Math.Content.7.SP.B.4 Use measures of center and measures of variability for numerical data from random samples to draw informal comparative inferences about two populations. For example, decide whether the words in a chapter of a seventh-grade science book are generally longer than the words in a chapter of a fourth-grade science book.

• CCSS.Math.Content.HSS-IC.B.5 Use data from a randomized experiment to compare two treatments; use simulations to decide if differences between parameters are significant.



http://www.corestandards.org/Math/Content/HSS/IC/B/5

Parallel Graphs

• Use ideas from before to compare:
– Shape
– Center
– Spread

• Be sure to use the same scale!

Parallel Graphs

What do you see?

Revisit the Timer Experiment

• How else might we explore this data?
• What would be some interesting comparisons to make?

• Website for parallel plots: enter data for two or more groups and it draws the graphs for you:

http://www.physics.csbsju.edu/stats/box2.html


Balancing your Design

• The study collects data on the treatment group to which each subject was assigned, the main response (time to cure), and other variables such as age.

• The researchers want to compare the responses for the two treatment groups, but are concerned that age might also be related to the response.

• They should check that age is balanced across the two treatment groups before looking for differences in the response by treatment.

Comparing Data: Usefulness of Randomization

A study compared two antibiotics for treating strep throat in children, Amoxicillin and Cefadroxil. At one center, 23 children were randomly assigned to one of two treatment groups. One concern is that the age of the child might influence the effectiveness of the antibiotics. The ages of the children in each treatment group are given below. How do the two groups compare with respect to age?

Give the five-number summary for each group. Comment on your results.

Amoxicillin Group (n=11): 8 9 9 10 10 11 11 12 14 14 17
Five-number summary:

Cefadroxil Group (n=12): 7 8 9 9 9 10 10 11 12 13 14 16
Five-number summary:

Make side-by-side boxplots for the antibiotic study data.

Comparing Data: Usefulness of Randomization ~ Instructor Side

Give the five-number summary for each of the two treatment groups. Comment on your results.

Amoxicillin Group (n=11): 8 9 9 10 10 11 11 12 14 14 17
Five-number summary: min=8, Q1=9, median=11, Q3=14, max=17

Cefadroxil Group (n=12): 7 8 9 9 9 10 10 11 12 13 14 16
Five-number summary: min=7, Q1=9, median=10, Q3=12.5, max=16
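The summaries above can be reproduced with a short sketch. It assumes the quartile convention that matches these answers: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when n is odd. Software packages often use other conventions and may give slightly different quartiles. The function name `five_number_summary` is introduced here for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumed convention: quartiles are the medians of the lower and upper
    halves, excluding the middle value when n is odd. Needs at least 2 values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # lower half of the sorted data
    upper = data[-half:]    # upper half of the sorted data
    return (data[0], median(lower), median(data), median(upper), data[-1])

amoxicillin = [8, 9, 9, 10, 10, 11, 11, 12, 14, 14, 17]
cefadroxil = [7, 8, 9, 9, 9, 10, 10, 11, 12, 13, 14, 16]

print("Amoxicillin:", five_number_summary(amoxicillin))
print("Cefadroxil: ", five_number_summary(cefadroxil))
```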

How long? 10 minutes.

How might it be done? Ask students to work through this exercise with a partner: one person can do it for the Amoxicillin data and the other for the Cefadroxil data. Then discuss the results. You could also have students start with all 23 children and perform the randomization themselves with a partner. Each pair of students will then get a different answer, and the class can see the effect of randomization overall. There may be a pair for which the randomization did not do so well; randomization does not guarantee balance.

How important? Important. It reinforces why randomization is a useful technique. It is a complete exercise for comparing two groups and assessing whether the researchers need to control for age differences in evaluating the effectiveness of the two antibiotics.
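If students perform the randomization themselves, a computer can stand in for drawing names from a hat. The sketch below pools the 23 ages from the study and splits them at random into groups of 11 and 12. The function name `randomize` and the use of a seed per trial are choices made here for illustration, not part of the original activity.

```python
import random
from statistics import median

# All 23 ages from the antibiotic study, pooled
ages = [8, 9, 9, 10, 10, 11, 11, 12, 14, 14, 17,
        7, 8, 9, 9, 9, 10, 10, 11, 12, 13, 14, 16]

def randomize(values, first_group_size=11, seed=None):
    """Shuffle the values and split them into two treatment groups."""
    rng = random.Random(seed)   # a seed makes a particular split repeatable
    shuffled = list(values)
    rng.shuffle(shuffled)
    return sorted(shuffled[:first_group_size]), sorted(shuffled[first_group_size:])

# Each trial plays the role of one pair of students doing their own randomization
for trial in range(5):
    group_a, group_b = randomize(ages, seed=trial)
    print(f"trial {trial}: median ages {median(group_a)} vs {median(group_b)}")
```

Comparing the medians across trials shows that most splits are reasonably balanced on age, while an occasional split is not, which is the point that randomization does not guarantee balance.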

[Figure: side-by-side boxplots of age for the Amoxicillin and Cefadroxil groups, vertical axis ticks from 6 to 16]

Which is Most Convincing?

[Figure: three panels labeled Study 1, Study 2, Study 3]

Is there a Difference? Using Simulations

• Background of the study goes here: how the simulation takes many samples of size 10 from the same population

Is there a Difference? Using Simulations

Experiment: measure the effect of caffeine (0 mg vs 200 mg) and decide whether the two doses have different effects on the number of finger taps per minute (measured 2 hours later).
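The idea behind a simulation-based comparison can be sketched as a randomization test: shuffle the group labels many times and see how often a difference at least as large as the observed one arises by chance alone. The finger-tapping measurements are not reproduced in these slides, so the sketch below reuses the age data from the antibiotic study rather than inventing numbers. The function name `randomization_p_value` and its `reps` and `seed` parameters are assumptions made for this illustration, and this is not necessarily the procedure the applet uses.

```python
import random

def average(values):
    return sum(values) / len(values)

def randomization_p_value(group_a, group_b, reps=5000, seed=1):
    """Share of random relabelings whose difference in means is at least as
    large (in absolute value) as the observed difference."""
    rng = random.Random(seed)
    observed = abs(average(group_a) - average(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)   # reassign the group labels at random
        diff = abs(average(pooled[:n_a]) - average(pooled[n_a:]))
        if diff >= observed - 1e-12:   # small tolerance for floating-point ties
            at_least_as_extreme += 1
    return at_least_as_extreme / reps

# Ages from the antibiotic study, used only because real numbers are available
amoxicillin = [8, 9, 9, 10, 10, 11, 11, 12, 14, 14, 17]
cefadroxil = [7, 8, 9, 9, 9, 10, 10, 11, 12, 13, 14, 16]

p_value = randomization_p_value(amoxicillin, cefadroxil)
print(f"share of shuffles at least as extreme as observed: {p_value:.3f}")
```

A large share suggests the observed difference is the kind that random assignment alone often produces, while a very small share suggests a real difference between the groups. The same function could be applied to the two caffeine groups once their tap counts are entered.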

Resampling Applet

• Applet for resampling: http://lock5stat.com/statkey/

• We will see more of this on Day 3


Day 1 Wrap Up

• What surprised you today?
• What did you find interesting?
• How might you bring these ideas to your class?
• What would you change?
• Other activities/ideas to share with the group?
