
M07-Numerical Summaries 1 1 Department of ISM, University of Alabama, 1995-2003

Lesson Objectives

Learn when each measure of a “typical value” is appropriate. Also called “central tendency” or “location.”

Learn when each measure of “variation” is appropriate. Also called “scatter” or “dispersion.”

See how these measures relate to statistical inference, which will be covered later in the course.


Statistics is the science of

• collecting

• organizing

• summarizing

• interpreting

DATA

for making decisions.


Organize / Summarize Data

Graphical    Numerical


Key Features of Data Distributions

Shape

Typical Value

Spread

Outliers

This section covers these two: Typical Value and Spread.


Measures of Location

Give “middle” or “typical” values, or “central tendency.”

Measures of Variation

Describe “spread” or “scatter” or “dispersion” in the data.


Measures of Location

1. Mean: the “center of gravity” of the data (histogram).


Formula for mean

Sample Mean = Sum of observations divided by sample size

$\bar{X} = \frac{\sum X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$


The mean is ________________

to extreme values (outliers).


2. Median - midpoint of distribution

At least half of the observations are at or less than the median, and at least half are at or greater than the median.


Note: For n observations, the median is located at the $\frac{n + 1}{2}$-th observation in the ordered sample.


Example 1

Data: 14, 18, 20, 12, 24, 15, 14 (n = 7 “odd”)

$\frac{7 + 1}{2} = 4$, so the median is at the 4th location.

Median is the middle value of the “ordered” data.

At least half the values are at or greater; at least half are at or lower.


Example 2 (median example)

Data: 14, 18, 20, 12, 24, 15, 14 (n = 7 “odd”); 94 (outlier)

Original, $\bar{X}$ =

With outlier, $\bar{X}$ =

Median is still the middle value.

Median is resistant to outliers.


Example 3

Data: 14, 18, 20, 12, 24, 15, 14, 214 (n = 8 “even,” outlier)

$\frac{8 + 1}{2} = 4.5$, so the median is at the 4.5th location.

Median is the average of the two middle values.

Exactly half the values are greater, half lower.
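To see how differently the mean and the median react to an outlier, here is a minimal Python sketch. It assumes the Example 1 data as the "original" sample and the Example 3 data (the same values plus 214) as the sample with an outlier; the variable names are illustrative only.

```python
from statistics import mean, median

# Example 1 data from the slides (n = 7)
original = [14, 18, 20, 12, 24, 15, 14]
# Example 3 data: the same values plus the outlier 214 (n = 8)
with_outlier = original + [214]

for label, data in [("original", original), ("with outlier", with_outlier)]:
    # The mean moves a lot when 214 is added; the median barely moves.
    print(f"{label:>12}: mean = {mean(data):.3f}, median = {median(data)}")
```

The mean more than doubles once 214 is included, while the median shifts only from 15 to 16.5.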


Summary for finding Median

1. Order the data.

2. For odd n, the median is the center observation.

3. For even n, the median is the average of the two center observations.
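The three steps can be sketched directly in Python. The function name `median_by_steps` is a hypothetical helper for illustration; in practice `statistics.median` does the same job.

```python
def median_by_steps(values):
    """Median following the three steps in the summary slide."""
    ordered = sorted(values)              # Step 1: order the data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:                        # Step 2: odd n, center observation
        return ordered[mid]
    # Step 3: even n, average of the two center observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_steps([14, 18, 20, 12, 24, 15, 14]))        # Example 1 data
print(median_by_steps([14, 18, 20, 12, 24, 15, 14, 214]))   # Example 3 data
```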


3. Mode - most frequently occurring number

In a histogram, the modal class is the one having the largest frequency, i.e., the highest bar.
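As a small illustration, the mode can be found by counting occurrences. This sketch reuses the Example 1 data; `multimode` returns every value tied for the highest count, which matters when a data set has more than one mode.

```python
from collections import Counter
from statistics import multimode

data = [14, 18, 20, 12, 24, 15, 14]   # Example 1 data

counts = Counter(data)                # frequency of each distinct value
modes = multimode(data)               # all values with the highest frequency

print(counts.most_common(3))
print("mode(s):", modes)
```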


When should each estimator be used?

What type of variable is it?

If categorical, use the mode. “Average” is meaningless; look at “percentages” of occurrences.

If variable is quantitative, first look at a graph:

Skewed or outliers? Use median.

More or less symmetric? Use mean.


Numerical Summary

Location: Mean, Median, Mode

Variation: Range, Std. Deviation, IQR


Why does variation matter?

Mountain Climbing Rope. Two suppliers; sample and test three ropes from each.

“Snap Breaking Strength”


Measures of Variation

1. Range

2. Variance & Standard Deviation

3. Mean Absolute Deviation (MAD)

4. Interquartile Range (IQR)


1. Range

Highest minus lowest value in the sample.


Example 4: 3, 4, 1, 7, 4, 5

[Dot plot of the data on an axis from 1 to 7]

Range =

Example 5: 1, 1, 1, 7, 7, 7

[Dot plot of the data on an axis from 1 to 7]

Range =
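The two ranges can be checked with a short Python sketch. `sample_range` is a hypothetical helper name; the data are the Example 4 and Example 5 values from the slide.

```python
def sample_range(values):
    """Highest minus lowest value in the sample."""
    return max(values) - min(values)


example_4 = [3, 4, 1, 7, 4, 5]
example_5 = [1, 1, 1, 7, 7, 7]

print("Example 4 range:", sample_range(example_4))
print("Example 5 range:", sample_range(example_5))
```

Both samples have the same range even though their values are spread very differently between the extremes, because the range looks only at the smallest and largest observations.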


Range

Advantage: _________ _________________.

Disadvantage:

_______ most of the data.

______________ to outliers.


2. Variance & Standard Deviation

How far are the data from the middle, on average?

Notation:

Sample Variance = $s^2$

Sample Std. Dev. = $s$

Population Variance = $\sigma^2$

Population Std. Dev. = $\sigma$


Example 4: 3, 4, 1, 7, 4, 5

[Dot plot of the data on an axis from 1 to 7]


We need to keep the negatives from canceling the positives.

We can do this by 1. _____________, ______

2. _____________, _____

Note: The average of the deviations from the mean will always be zero.
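The note about deviations averaging to zero can be verified numerically. This sketch uses the Example 4 data and then shows two common ways to stop negatives from canceling positives, squaring and taking absolute values, which correspond to the variance and MAD formulas in these slides.

```python
from statistics import mean

data = [3, 4, 1, 7, 4, 5]                 # Example 4 data
x_bar = mean(data)                        # sample mean

deviations = [x - x_bar for x in data]    # signed distances from the mean
print("deviations:", deviations, "sum:", sum(deviations))

# Two ways to keep negatives from canceling positives:
squared = [d ** 2 for d in deviations]    # basis of the variance
absolute = [abs(d) for d in deviations]   # basis of the MAD
print("sum of squares:", sum(squared))
print("sum of absolute values:", sum(absolute))
```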


Equation for Variance: (see page 88)

For a population:

$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$

For a sample:

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$


Equation for Variance: (see page 88)

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

Example 4 data:

$s^2 = \frac{(3-4)^2 + (4-4)^2 + (1-4)^2 + (7-4)^2 + (4-4)^2 + (5-4)^2}{6 - 1}$

=

= units?


Equations for Variance:

1. $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$ (see page 88)

2. $s^2 = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$ (see page 90)

3. $s^2 = \frac{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}{n - 1}$


Example 4: 3, 4, 1, 7, 4, 5

  X     X^2
  3      9
  4     16
  1      1
  7     49
  4     16
  5     25
 --    ---
 24    116

$\sum X = 24$

$\sum X^2 = 116$


$s^2 = \frac{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}{n - 1} = \frac{116 - \frac{24^2}{6}}{6 - 1} = \frac{116 - 96}{5} = 4.0$
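As a check on the arithmetic, the sketch below computes the sample variance of the Example 4 data in two ways: from the squared deviations and from the sums $\sum X_i$ and $\sum X_i^2$. The function names are hypothetical, chosen only for this illustration.

```python
def variance_definitional(values):
    """s^2 = sum((X_i - X_bar)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def variance_shortcut(values):
    """s^2 = (sum(X_i^2) - (sum(X_i))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)                   # 24 for Example 4
    sum_x2 = sum(x * x for x in values)   # 116 for Example 4
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


data = [3, 4, 1, 7, 4, 5]                 # Example 4 data
print(variance_definitional(data), variance_shortcut(data))
```

Both functions divide by n - 1, so they compute the sample variance, not the population variance.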


Comments

• Both equations should give the same answer.

• First is easier when data and the mean are integers.

• Second is easier for larger data sets, or data not integer.

• More chance of round-off error with first equation.


Variance

Advantage: ________________; ________________.

Disadvantages: Units are _________.

____ resistant to outliers.


Standard Deviation

$s = \sqrt{s^2}$, “The square root of the variance.”

$s = \sqrt{4.0} = 2.0$

Advantage: Easier to interpret than variance; units same as data.
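For comparison, Python's standard `statistics` module gives the same numbers for the Example 4 data. `variance` and `stdev` are the sample versions (dividing by n - 1); the population versions are `pvariance` and `pstdev`.

```python
import math
from statistics import stdev, variance

data = [3, 4, 1, 7, 4, 5]   # Example 4 data

s2 = variance(data)         # sample variance, in squared units
s = stdev(data)             # sample standard deviation, same units as data

print("s^2 =", s2)
print("s   =", s)
print("sqrt(s^2) =", math.sqrt(s2))
```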


3. Mean Absolute Deviation, MAD (see page 87)

For population data:

$\text{MAD} = \frac{\sum |x_i - \mu|}{N}$

For sample data:

$\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$

This will be used extensively in OM 300.
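A minimal sketch of the MAD calculation, applied to the Example 4 data. The helper name `mean_absolute_deviation` is hypothetical. The same code serves the population and the sample formula, since both divide the sum of absolute deviations by the number of observations.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the observations from their mean."""
    n = len(values)
    center = sum(values) / n
    return sum(abs(x - center) for x in values) / n


data = [3, 4, 1, 7, 4, 5]   # Example 4 data, mean = 4
print(f"MAD = {mean_absolute_deviation(data):.3f}")
```

The absolute deviations are 1, 0, 3, 3, 0, 1, which sum to 8, so the MAD is 8/6.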


4. Interquartile Range (IQR)

$\text{IQR} = Q_3 - Q_1$

IQR is the range of the middle 50% of the data.

Observations more than 1.5 IQRs beyond the quartiles are considered outliers.

More on this later.
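The 1.5 IQR rule can be sketched as follows, using the Example 3 data. One assumption to flag: the slides do not say how the quartiles are computed, and conventions vary between textbooks and packages. This sketch uses the default "exclusive" method of Python's `statistics.quantiles`, which places a quartile at position p(n + 1) in the ordered sample. The function name `iqr_outliers` is hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return Q1, Q3, IQR and the observations beyond k IQRs from the quartiles."""
    # Assumption: default method="exclusive"; other quartile rules can differ.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [x for x in values if x < lower or x > upper]
    return q1, q3, iqr, outliers


data = [14, 18, 20, 12, 24, 15, 14, 214]   # Example 3 data
q1, q3, iqr, outliers = iqr_outliers(data)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, outliers = {outliers}")
```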


Statistical Inference

Generalizing from a sample to a population, by using a statistic to estimate a parameter.


                     Statistic                   Parameter
                     (from sample)               (from entire population)

Mean:                $\bar{X}$     estimates     $\mu$

Standard deviation:  $s$           estimates     $\sigma$

Proportion:          $\hat{p}$     estimates     $p$


Descriptive Statistics

Numerical    Graphical


Example 5:

Estimate the true mean net weight of 16 oz. bags of Golden Flake Potato Chips with a 95% confidence interval.

Measured Weights in ounces. (Not real data)

16.05  16.01  15.92  15.68  16.10  16.01  15.72  15.80  16.21  15.70

15.95  16.24  16.02  15.90  16.07  16.05  16.18  15.45  16.04  16.05

Is the filling machine doing what it should be doing?

Use Minitab


[Minitab screenshot with callouts: Data window; name of worksheet file; most commonly used features; Session window]


Menu path: “Stat” > “Basic Statistics” > “Display descriptive statistics”


“Session Window” results

Results for: c07 Weight of chips.MTW

Descriptive Statistics: Weights

Variable     N     Mean   Median   TrMean    StDev  SE Mean
Weights     20   15.958   16.015   15.970    0.199    0.045

Variable   Minimum   Maximum       Q1       Q3
Weights     15.450    16.240   15.825   16.065

Executing from file: C:\Program Files\MTBWIN\MACROS\Describe.MAC

Descriptive Statistics Graph: Weights

“Five number” summary
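The Session window figures can be reproduced approximately outside Minitab. The sketch below is not Minitab's own procedure; it uses Python's `statistics` module on the 20 weights from Example 5. Two assumptions are flagged in the comments: the quartiles use the "exclusive" p(n + 1) position rule, and the trimmed mean drops 5% of the observations from each end (one value per end here).

```python
import math
from statistics import mean, median, quantiles, stdev

weights = [16.05, 16.01, 15.92, 15.68, 16.10, 16.01, 15.72, 15.80, 16.21, 15.70,
           15.95, 16.24, 16.02, 15.90, 16.07, 16.05, 16.18, 15.45, 16.04, 16.05]

n = len(weights)
x_bar = mean(weights)
med = median(weights)
s = stdev(weights)                      # sample standard deviation (n - 1)
se_mean = s / math.sqrt(n)              # standard error of the mean
q1, _, q3 = quantiles(weights, n=4)     # assumption: default "exclusive" rule

# Assumption: TrMean trims 5% of the observations from each end.
k = round(0.05 * n)
tr_mean = mean(sorted(weights)[k:n - k])

print(f"N={n}  Mean={x_bar:.3f}  Median={med:.3f}  TrMean={tr_mean:.3f}")
print(f"StDev={s:.3f}  SE Mean={se_mean:.3f}")
# Five-number summary: minimum, Q1, median, Q3, maximum
print(f"Min={min(weights):.3f}  Q1={q1:.3f}  Median={med:.3f}  "
      f"Q3={q3:.3f}  Max={max(weights):.3f}")
```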


[Minitab graphical summary with callouts: histogram with Normal distribution curve superimposed; box plot; “95% Confidence Interval” for the population mean]


“95% Confidence Interval” for the population mean.

A confidence interval gives the limits of the plausible values of the true population mean, $\mu$.

Our sample mean was 15.957 oz. This is less than 16.000. Should we be concerned?

____, because 16.000 is a plausible value for the true population mean.
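A rough sketch of how such an interval can be computed by hand, assuming the usual t-based interval $\bar{X} \pm t \cdot s / \sqrt{n}$ (the slides do not state which formula Minitab applies). The critical value 2.093 is the standard t-table entry for 19 degrees of freedom at 95% confidence; it is hard-coded here so the example needs only the standard library.

```python
import math
from statistics import mean, stdev

weights = [16.05, 16.01, 15.92, 15.68, 16.10, 16.01, 15.72, 15.80, 16.21, 15.70,
           15.95, 16.24, 16.02, 15.90, 16.07, 16.05, 16.18, 15.45, 16.04, 16.05]

T_CRIT = 2.093   # assumption: t table value, 19 degrees of freedom, 95% two-sided

n = len(weights)
x_bar = mean(weights)
se_mean = stdev(weights) / math.sqrt(n)   # standard error of the mean

lower = x_bar - T_CRIT * se_mean
upper = x_bar + T_CRIT * se_mean

print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")
print("16.000 inside the interval?", lower <= 16.0 <= upper)
```

Under these assumptions the interval runs from roughly 15.86 to 16.05 oz, so it includes the target weight of 16.000 oz.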