statistics. population data: including data from all people or items with the characteristic one...

46
Chapter 5 Statistics

Upload: linette-reed

Post on 11-Jan-2016


Page 1: Statistics. Population Data: Including data from ALL people or items with the characteristic one wishes to understand. Sample Data: Utilizing a set

Chapter 5 Statistics


Types of Sampling


Population Versus Sample

• Population Data:
  • Including data from ALL people or items with the characteristic one wishes to understand.

• Sample Data:
  • Utilizing a set of data collected and/or selected from a statistical population by a defined procedure.


Population and Sample Data

• Can you think of examples?

• When would you use population data?
  • EX:

• When would you use sample data?
  • EX:


Sampling Methods

• 5 main methods:
  • Random Sampling
  • Systematic Sampling
  • Stratified Sampling
  • Cluster Sampling
  • Convenience Sampling


Random Sampling

• The “pick a name out of the hat” technique
• Random number table
• Random number generator

Hawkes and Marsh (2004)


Systematic Sampling

• All data is sequentially numbered
• Every nth piece of data is chosen

Hawkes and Marsh (2004)


Stratified Sampling

• Data is divided into subgroups (strata)
• Strata are based on a specific characteristic
  • Age
  • Education level
  • Etc.
• Use random sampling within each stratum

Hawkes and Marsh (2004)


Cluster Sampling

• Data is divided into clusters
  • Usually geographic
• Random sampling used to choose clusters
• All data used from selected clusters

Hawkes and Marsh (2004)


Convenience Sampling

• Data is chosen based on convenience
• BE WARY OF BIAS!

Hawkes and Marsh (2004)


BIAS

• Bias means how far, on average, an estimated value is from the true value it is meant to measure.

• If an estimate has zero bias, it is called unbiased.

• Why is this important in statistical studies?


A Few Types of Bias

• Selection Bias
• Omitted-Variable Bias
• Funding Bias
• Reporting/Response Bias
• Analytical Bias
• Exclusion Bias

• Can you think of others?


Example 1: Sampling Methods

In a class of 18 students, 6 are chosen for an assignment.

Sampling Type   Example
Random          Pull 6 names out of a hat
Systematic      Select every 3rd student
Stratified      Divide the class into 2 equal age groups. Randomly choose 3 from each group
Cluster         Divide the class into 6 groups of 3 students each. Randomly choose 2 groups
Convenience     Take the 6 students closest to the teacher
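The five selections in the table can be sketched in Python. This is only an illustration under stated assumptions: the student labels S1 to S18, the fixed seed, the use of list order as a stand-in for "closest to the teacher", and the split of the list into two halves as "age groups" are all hypothetical and not part of the slides.

```python
import random

def random_sample(students, k, rng):
    # "Names out of a hat": every student is equally likely to be drawn.
    return rng.sample(students, k)

def systematic_sample(students, step):
    # Every step-th student in the numbered list (3rd, 6th, 9th, ...).
    return students[step - 1::step]

def stratified_sample(strata, k_per_stratum, rng):
    # Random sampling inside each subgroup (stratum).
    chosen = []
    for group in strata:
        chosen.extend(rng.sample(group, k_per_stratum))
    return chosen

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep everyone in them.
    chosen = []
    for cluster in rng.sample(clusters, n_clusters):
        chosen.extend(cluster)
    return chosen

def convenience_sample(students, k):
    # Whoever is easiest to reach; here, simply the first k in the list.
    return students[:k]

rng = random.Random(0)                               # hypothetical seed
students = [f"S{i}" for i in range(1, 19)]           # hypothetical labels

random_pick = random_sample(students, 6, rng)
systematic_pick = systematic_sample(students, 3)
strata = [students[:9], students[9:]]                # stand-in for 2 age groups
stratified_pick = stratified_sample(strata, 3, rng)
clusters = [students[i:i + 3] for i in range(0, 18, 3)]  # 6 groups of 3
cluster_pick = cluster_sample(clusters, 2, rng)
convenience_pick = convenience_sample(students, 6)

print(systematic_pick)  # ['S3', 'S6', 'S9', 'S12', 'S15', 'S18']
```

Every method returns 6 students, but only the first four involve a defined selection procedure that does not depend on who is easiest to reach.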


Example 2: Utilizing Sampling Methods

• Determine average student age
• Sample of 10 students
• Ages of 50 statistics students

18 21 42 32 17 18 18 18 19 22

25 24 23 25 18 18 19 19 20 21

19 29 22 17 21 20 20 24 36 18

17 19 19 23 25 21 19 21 24 27

21 22 19 18 25 23 24 17 19 20


Example 2 – Random Sampling

• Random number generator (www.random.org)

Data Point Location   Corresponding Data Value
35                    25
48                    17
37                    19
14                    25
47                    24
4                     32
33                    19
35                    25
34                    23
3                     42
Mean                  25.1


Example 2 – Systematic Sampling

• Take every 5th data point

Data Point Location   Corresponding Data Value
5                     17
10                    22
15                    18
20                    21
25                    21
30                    18
35                    25
40                    27
45                    25
50                    20
Mean                  21.4

Example 2 – Convenience Sampling

• Take the first 10 data points

Data Point Location   Corresponding Data Value
1                     18
2                     21
3                     42
4                     32
5                     17
6                     18
7                     18
8                     18
9                     19
10                    22
Mean                  22.5

Example 2 - Comparison

Sampling Method vs. Average Age

Random            25.1
Systematic        21.4
Convenience       22.5
All 50 students   21.7
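The sample means in Example 2 can be recomputed directly from the 50 listed ages. The sketch below assumes the ages are numbered row by row (position 1 is the first value of the first row), which is the numbering that reproduces the random and convenience tables. The helper name `values_at` is hypothetical.

```python
from statistics import mean

# Ages of the 50 statistics students, read row by row from the slide.
ages = [
    18, 21, 42, 32, 17, 18, 18, 18, 19, 22,
    25, 24, 23, 25, 18, 18, 19, 19, 20, 21,
    19, 29, 22, 17, 21, 20, 20, 24, 36, 18,
    17, 19, 19, 23, 25, 21, 19, 21, 24, 27,
    21, 22, 19, 18, 25, 23, 24, 17, 19, 20,
]

def values_at(positions):
    # Positions are 1-based, as in the "Data Point Location" column.
    return [ages[p - 1] for p in positions]

# Positions reported on the random-sampling slide.
random_positions = [35, 48, 37, 14, 47, 4, 33, 35, 34, 3]

random_mean = mean(values_at(random_positions))
systematic_mean = mean(ages[4::5])   # positions 5, 10, ..., 50
convenience_mean = mean(ages[:10])   # first 10 data points
population_mean = mean(ages)         # all 50 students

print(f"Random:      {random_mean:.1f}")       # 25.1
print(f"Systematic:  {systematic_mean:.1f}")   # 21.4
print(f"Convenience: {convenience_mean:.1f}")  # 22.5
print(f"Population:  {population_mean:.1f}")   # 21.7
```

With only 10 values per sample, a couple of unusually old students (ages 42 and 32 appear in both the random and the convenience sample) are enough to pull a sample mean away from the population mean.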


ASSIGNMENT!

• In a group of two or three, create a list of at least 3 pros and 3 cons for each type of sampling.

• In the same group, create a list of when you may use each type of sampling and for what reason.

• As a group determine which type of sampling is overall the best, and which is overall the easiest.


The Mean


Helpful Vocabulary

• Measures of Central Tendency: Values that describe the center of a distribution. The mean, median, and mode are 3 measures of central tendency.

• Mean: A measure of central tendency that is determined by dividing the sum of all values in a data set by the number of values.

• Frequency Distribution Table: A table that lists a group of data values, as well as the number of times each value appears in the data set.

• Outliers: Extreme values in a data set.


Mean Symbols for a Population

• µ, pronounced ‘mu’
  • Symbol which represents the population mean

• ∑
  • Symbol which means ‘the sum of’ – represents the addition of numbers

• N
  • Symbol which represents the number of data values in a given population


Mean for a Population

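• In words:
  • Mean = (sum of all data values in the population) ÷ (number of data values in the population)

• In mathematical symbols:
  • $\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N} = \dfrac{\sum x}{N}$

• x1, x2, etc. are the given data values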


Mean Symbols for a Sample

• x̄, pronounced ‘x bar’
  • Symbol which represents the sample mean

• ∑
  • Symbol which means ‘the sum of’ – represents the addition of numbers

• n
  • Symbol which represents the number of data values in a given sample


Mean for a Sample

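• In words:
  • Mean = (sum of all data values in the sample) ÷ (number of data values in the sample)

• In mathematical symbols:
  • $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum x}{n}$

• x1, x2, etc. are the given data values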


Example 1

• Mark operates a donut business which has 8 employees. Their ages are as follows: 55, 63, 34, 59, 29, 46, 51, 41.

• Find the mean age of the workers.
• Which will we use? Population or Sample? Why?
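As a quick check of the arithmetic, the mean can be computed as the sum of the values divided by their count. The sketch below treats the 8 ages as a population, on the assumption that the list covers every employee of the business. The variable names are illustrative only.

```python
# Ages of all 8 employees, as listed in the example.
ages = [55, 63, 34, 59, 29, 46, 51, 41]

N = len(ages)          # number of data values in the population
mu = sum(ages) / N     # population mean: sum of values divided by N

print(sum(ages), N, mu)  # 378 8 47.25
```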


Example 2

• The selling prices for the last 10 houses sold in a small town are listed below:
  • $125,000  $142,000  $129,500  $89,500  $105,000
    $144,000  $168,300  $96,000  $182,300  $212,000

• Calculate the mean selling price of the last 10 homes that were sold. Is this a population or sample?


Frequency Distribution Table

• 60 students were asked how many books they had read over the past 12 months. The results are listed in the frequency distribution table below. Calculate the mean number of books read per student.

Books   Frequency
0       1
1       6
2       8
3       10
4       13
5       8
6       5
7       6
8       3
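With a frequency distribution table, each value is multiplied by its frequency before summing, and the total is divided by the sum of the frequencies. A minimal sketch for the books table follows. The dictionary layout and variable names are illustrative choices.

```python
# Frequency distribution: number of books read -> number of students.
books = {0: 1, 1: 6, 2: 8, 3: 10, 4: 13, 5: 8, 6: 5, 7: 6, 8: 3}

total_students = sum(books.values())                          # sum of frequencies
total_books = sum(value * freq for value, freq in books.items())  # sum of value x frequency

mean_books = total_books / total_students

print(total_students, total_books, mean_books)  # 60 240 4.0
```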


Create a Frequency Distribution

• The following data shows the heights in centimeters of a group of 10th grade students. Organize the data in a frequency distribution table and calculate the mean height of the students.

• 183 171 158 171 182 158 164 183
  179 170 182 183 170 171 167 176
  176 164 176 179 183 176 170 183
  183 167 167 176 171 182 179 170
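One way to build the frequency distribution table is to count how often each height occurs and then compute the mean from the counts. The sketch below uses `collections.Counter` from Python's standard library. The print layout is only an illustration.

```python
from collections import Counter

# Heights in centimeters, as listed in the exercise (32 students).
heights = [
    183, 171, 158, 171, 182, 158, 164, 183,
    179, 170, 182, 183, 170, 171, 167, 176,
    176, 164, 176, 179, 183, 176, 170, 183,
    183, 167, 167, 176, 171, 182, 179, 170,
]

# Count how many times each height appears.
table = Counter(heights)

print("Height  Frequency")
for height in sorted(table):
    print(f"{height:>6}  {table[height]:>9}")

n = sum(table.values())
mean_height = sum(height * freq for height, freq in table.items()) / n

print(n, mean_height)  # 32 174.0625
```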


Outliers

• The mean can be affected by extreme values or outliers.
  • Example:
  • If you are employed by a company that paid all of its employees a salary between $60,000 and $70,000, you could estimate the mean salary to be about $65,000. However, if you add the $150,000 salary of the CEO, then the mean would increase greatly.
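The effect of a single outlier can be seen with a small made-up payroll. The five employee salaries below are hypothetical values chosen between $60,000 and $70,000; only the $150,000 CEO salary comes from the example.

```python
from statistics import mean

# Hypothetical employee salaries between $60,000 and $70,000.
salaries = [60_000, 62_000, 65_000, 68_000, 70_000]
ceo_salary = 150_000  # the outlier from the example

mean_without = mean(salaries)
mean_with = mean(salaries + [ceo_salary])

print(f"Without CEO: ${mean_without:,.2f}")  # Without CEO: $65,000.00
print(f"With CEO:    ${mean_with:,.2f}")     # With CEO:    $79,166.67
```

One extreme value moves the mean above every ordinary employee's salary, so the mean no longer describes a typical employee.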


TECHNOLOGY

• To calculate the mean of a sample in the calculator:
  • STAT → Edit → Enter your data into L1 → 2nd → Quit
  • STAT → CALC → 1-Var Stats → Enter → Enter


Technology

• Use technology to determine the mean of the following set of numbers:
  • 24, 25, 25, 25, 26, 26, 27, 27, 28, 28, 31, 32
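The slides use a graphing calculator here. As an alternative form of technology (an assumption, not something the slides prescribe), the same mean can be found with the `statistics` module in Python's standard library:

```python
from statistics import mean

data = [24, 25, 25, 25, 26, 26, 27, 27, 28, 28, 31, 32]

data_mean = mean(data)  # sum of the values divided by how many there are

print(sum(data), len(data), data_mean)  # 324 12 27
```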


Technology Example

• In Tim’s school, there are 25 teachers. Each teacher travels to school every morning in his or her own car. The distribution of the driving times (in minutes) from home to school for the teachers is shown in the table below:

Driving Times       Number of Teachers
0 to 10 minutes     3
10 to 20 minutes    10
20 to 30 minutes    6
30 to 40 minutes    4
40 to 50 minutes    2


Technology Example 2

• The following table shows the frequency distribution of the number of hours spent per week texting messages on a cell phone by 60 10th grade students at a local high school. Calculate the mean number of hours per week spent texting.

Time per Week (hours)   Number of Students
0 to less than 5        8
5 to less than 10       11
10 to less than 15      15
15 to less than 20      12
20 to less than 25      9
25 to less than 30      5
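When data are grouped into classes, the individual values are unknown, so the mean can only be estimated. A common approach, assumed here because the slides do not state the method, is to represent each class by its midpoint and weight it by the class frequency. The function name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes):
    """Estimate the mean of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    Each class is represented by its midpoint (an assumption).
    """
    total = sum(freq for _, _, freq in classes)
    weighted = sum((low + high) / 2 * freq for low, high, freq in classes)
    return weighted / total

# Driving times of the 25 teachers (minutes).
driving = [(0, 10, 3), (10, 20, 10), (20, 30, 6), (30, 40, 4), (40, 50, 2)]

# Hours per week spent texting by the 60 students.
texting = [(0, 5, 8), (5, 10, 11), (10, 15, 15),
           (15, 20, 12), (20, 25, 9), (25, 30, 5)]

print(grouped_mean(driving))  # 21.8
print(grouped_mean(texting))  # 14.0
```

On the calculator, the equivalent is to enter the midpoints in L1 and the frequencies in L2 before running 1-Var Stats on both lists.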


The Median


Helpful Vocabulary

• Median: The value of the middle term in a set of organized data.
• Cumulative Frequency: The sum of the frequencies up to and including that frequency.


Example

• Find the median of the following set of data:
  • 12, 2, 16, 8, 14, 10, 6

• First organize the data from least to greatest.
• Then find the middle number. When there are two middle numbers, add them together and divide by 2.


Example

• Find the median of the following data:
  • 7, 9, 3, 4, 11, 1, 8, 6, 1, 4
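The median procedure (sort the data, then take the middle value, or average the two middle values when the count is even) can be sketched as a short function. The function name `median` is chosen for illustration; Python's standard library also provides `statistics.median`.

```python
def median(values):
    # Step 1: organize the data from least to greatest.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    # Step 2: an odd count has a single middle number.
    if n % 2 == 1:
        return ordered[mid]
    # An even count has two middle numbers: add them and divide by 2.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [12, 2, 16, 8, 14, 10, 6]           # 7 values
even_data = [7, 9, 3, 4, 11, 1, 8, 6, 1, 4]    # 10 values

print(sorted(odd_data), median(odd_data))    # [2, 6, 8, 10, 12, 14, 16] 10
print(sorted(even_data), median(even_data))  # [1, 1, 3, 4, 4, 6, 7, 8, 9, 11] 5.0
```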


Example

• The amount of money spent by each of 15 high school girls for a prom dress is shown below. Find the median price of a prom dress.
  • $250 $175 $325
    $195 $450 $300
    $275 $350 $425
    $150 $375 $300
    $400 $225 $360


TECHNOLOGY

• To calculate the median of a sample in the calculator:
  • STAT → Edit → Enter your data into L1 → 2nd → Quit
  • STAT → CALC → 1-Var Stats → Enter → Enter
  • Scroll down to the Med entry, which gives you the median of the data.


TECHNOLOGY

• The local police department spent the holiday weekend ticketing drivers who were speeding. 50 locations within the state were targeted. The number of tickets issued during the weekend in each of the locations is shown below. What is the median number of speeding tickets issued?

• 32 12 15 8 16 42 9
  18 11 10 24 18 6 17 21
  41 3 5 35 27 13 26 16
  28 31 3 7 37 10 19 23
  33 7 25 36 40 15 21 38
  46 17 37 9 2 33 41 23
  29 19 40


The Mode


Helpful Vocabulary

• Mode: The value or values that occur with the greatest frequency in a data set.
• Unimodal: The term used to describe the distribution of a data set that has only one mode.
• Bimodal: The term used to describe the distribution of a data set that has 2 modes.
• Multimodal: The term used to describe the distribution of a data set that has more than two modes.


Example

• The posted speed limit along a busy highway is 65 miles per hour. The following values represent the speeds (in mph) of 10 cars that were stopped for violating the speed limit. Find the mode.
  • 76 81 79 80 78 83 77 79 82 75

• Is this unimodal, bimodal, or multimodal?


Example

• The ages of 12 randomly selected customers at a local coffee shop are listed below. What is the mode of the ages?
  • 23 21 29 24 31 21 27 23 24 32 33 19

• Is this unimodal, bimodal, or multimodal?
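Finding the mode amounts to counting how often each value occurs and keeping the value or values with the highest count. The sketch below uses `statistics.multimode` (available in Python 3.8 and later) and labels the result with the vocabulary above. The function name `describe_modes` is hypothetical, and the labelling assumes at least one value repeats.

```python
from statistics import multimode

def describe_modes(values):
    # multimode returns every value tied for the greatest frequency.
    modes = sorted(multimode(values))
    labels = {1: "unimodal", 2: "bimodal"}
    # More than two modes is called multimodal.
    # Assumption: at least one value in the data repeats.
    return modes, labels.get(len(modes), "multimodal")

speeds = [76, 81, 79, 80, 78, 83, 77, 79, 82, 75]
ages = [23, 21, 29, 24, 31, 21, 27, 23, 24, 32, 33, 19]

print(describe_modes(speeds))  # ([79], 'unimodal')
print(describe_modes(ages))    # ([21, 23, 24], 'multimodal')
```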


•QUESTIONS???