Post on 12-Jan-2016

Statistics 1

How long is a name?

• To answer this question, we might collect some data on the length of a name.

How long is a name?

• First we need to establish our target population.

• Let’s say the students in this mathematics class.

How long is a name?

• What names should we use?

• Names as listed on the roll.

Data

Averaging

• An average is a measure of central tendency.

• There are 3 measures which we can use.

• MEAN

• MEDIAN

• MODE

Mean

• Usually when we say average, we are referring to the mean.

• To find the mean, we add up all the numbers and divide by how many there are.

Example

• Find the mean of 4, 0, 2, 1, 6

In Excel we can use the formula

• =average(range), where range is the cells you highlight
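For readers working outside Excel, the same calculation can be sketched in Python. This is only an illustration of the "add up and divide" rule applied to the example above; the variable names are our own, and `statistics.mean` comes from Python's standard library.

```python
from statistics import mean

# Data from the example above.
values = [4, 0, 2, 1, 6]

# Add up all the numbers and divide by how many there are.
by_hand = sum(values) / len(values)

print(by_hand)       # 2.6
print(mean(values))  # 2.6
```

Both lines print the same value, so the library function is doing the same sum-and-divide step.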

Data on names

Median

• The median is the middle value when the data is put in order.

• If there is an odd number of data values, there is a single middle value.

• If there is an even number of data values, we average the two middle values.

Example

• Find the median of 4, 8, 2, 9, 1

• First put them in order

• 1, 2, 4, 8, 9

• The middle number is ‘4’

Example

• Find the median of 4, 8, 2, 9, 1, 6

• First put them in order

• 1, 2, 4, 6, 8, 9

• The middle numbers are ‘4’ and ‘6’

• Averaging them gives a median of 5.

In Excel, sort the data or use the formula =median(data)
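The odd and even cases can also be checked with a short Python sketch. The helper name `median_by_hand` is hypothetical and simply follows the steps described above (sort, then take the middle value or average the two middle values); `statistics.median` is the standard-library equivalent.

```python
from statistics import median


def median_by_hand(values):
    """Sort the data, then take the middle value (or average the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [4, 8, 2, 9, 1]
even_data = [4, 8, 2, 9, 1, 6]

print(median_by_hand(odd_data))   # 4
print(median_by_hand(even_data))  # 5.0
print(median(even_data))          # 5.0
```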

Mode

• The mode is the most common number.

• A data set can have more than one mode; with two modes it is called bimodal.

Example

• Find the mode of 6, 4, 3, 7, 8, 6, 7, 2

• There are two modes 6 and 7

Using Excel

• Formula =mode(data)

• You must be careful as Excel will only give one mode
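A small Python sketch shows the same pitfall. Counting each value by hand finds both modes, while `statistics.mode` reports only one of them (in Python 3.8 and later, the first one it meets). The variable names are our own; `multimode` is the standard-library function that returns every mode.

```python
from collections import Counter
from statistics import mode, multimode

# Data from the example above.
values = [6, 4, 3, 7, 8, 6, 7, 2]

# Count how often each number occurs, then keep the most common ones.
counts = Counter(values)
highest = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == highest)

print(modes)              # [6, 7]
print(multimode(values))  # [6, 7]
print(mode(values))       # 6 (only one mode is reported)
```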

Which average is the best?

• Generally we use the mean, as it includes all the data. If there are extreme values, the median is a better measure because it is not affected by them.

Example

• These are the incomes of a group of university students.

• $2400, $1500, $2000, $1800, $22 000

• Find the best ‘average’.

Example

• $2400, $1500, $2000, $1800, $22 000

• The mean is not representative whereas the median is.
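To see why, it may help to compute both averages for the incomes above. The following Python sketch does this; the variable name `incomes` is our own.

```python
from statistics import mean, median

# Incomes from the example above, in dollars.
incomes = [2400, 1500, 2000, 1800, 22000]

print(mean(incomes))    # 5940
print(median(incomes))  # 2000
```

The mean of $5940 is higher than four of the five incomes because the single $22 000 value pulls it up, while the median of $2000 sits in the middle of the typical incomes.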

Frequency tables

Length   Tally               Frequency
3        ||                  2
4        ||||/               5
5        ||||/ ||||/ ||||    14
6        ||||/ ||            7
7        ||||/               5
8        ||                  2

Mode is 5

Median is also 5

Mean is 5.4

Calculating the mean by hand
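One way to lay out the hand calculation is to multiply each length by its frequency, add the products, and divide by the total frequency. The Python sketch below follows those steps using the frequency table for name lengths; the dictionary layout and variable names are assumptions made for illustration.

```python
from statistics import median

# Frequency table from the slides: name length -> frequency.
frequency = {3: 2, 4: 5, 5: 14, 6: 7, 7: 5, 8: 2}

# Total number of names is the sum of the frequencies.
total_names = sum(frequency.values())

# Total number of letters is the sum of (length x frequency).
total_letters = sum(length * count for length, count in frequency.items())

mean_length = total_letters / total_names

print(total_names)    # 35
print(total_letters)  # 189
print(mean_length)    # 5.4

# The mode is the length with the highest frequency.
print(max(frequency, key=frequency.get))  # 5

# The median is the middle value once every name length is listed in order.
expanded = [length for length, count in frequency.items() for _ in range(count)]
print(median(expanded))  # 5
```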

Using the calculator

• STAT mode

• Place data in list 1

• Place frequency in list 2

• CALC, SET

• 1Var Xlist list1

• 1Var Freq list2

• Exe

• 1Var

Measures of spread

• It is not enough to just give the ‘average’.

• The mean, median and mode are the same for all 3 sets of data:

• 48 49 50 50 51 52

• 40 45 50 50 50 55 60

• 0 0 50 50 50 100 100

• But the data sets are quite different
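This claim can be checked with a short Python sketch, which also computes a simple measure of spread (highest value minus lowest value) for each set. The variable names are our own.

```python
from statistics import mean, median, mode

# The three data sets from the slide above.
data_sets = [
    [48, 49, 50, 50, 51, 52],
    [40, 45, 50, 50, 50, 55, 60],
    [0, 0, 50, 50, 50, 100, 100],
]

# Every set has the same three averages.
averages = [(mean(d), median(d), mode(d)) for d in data_sets]
print(all(a == (50, 50, 50) for a in averages))  # True

# The spread, however, is very different.
ranges = [max(d) - min(d) for d in data_sets]
print(ranges)  # [4, 20, 100]
```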

Measures of spread

• Range = (highest number) - (lowest number)

• For our data set the first names have a range of 8 - 3 = 5

Measures of spread

• Again, if there are extreme values, the range can distort the true spread of the data.

5-number summary

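• We often summarise the data with a 5-number summary: lowest value, lower quartile, median, upper quartile, highest value.

• The quartiles split the ordered data into 4 groups of roughly equal size.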

Example 1

• 1 14 29 35 43 48 49 78 82 82 92 95 95

• 13 numbers

Example 1

• 1 14 29 35 43 48 49 78 82 82 92 95 95

• Lowest is 1

• Lower quartile is 35

• Median is 49

• Upper quartile is 82

• Highest is 95

Example 2

• 9 11 17 22 23 28 30 36

• Lower quartile is 14

• Median is 22.5

• Upper quartile is 29

• 5-number summary is

• 9 14 22.5 29 36
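Both examples can be reproduced with a short Python sketch. The function name `five_number_summary` is hypothetical. It assumes one particular quartile convention: each quartile is the median of one half of the ordered data, and when the count is odd the overall median is included in both halves. That assumption is inferred from the quartiles 35 and 82 in Example 1; other conventions exist and can give different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest).

    Assumed convention: quartiles are medians of the two halves, and the
    overall median belongs to both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1:
        lower = data[:half + 1]  # includes the median
        upper = data[half:]      # includes the median
    else:
        lower = data[:half]
        upper = data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


example_1 = [1, 14, 29, 35, 43, 48, 49, 78, 82, 82, 92, 95, 95]
example_2 = [9, 11, 17, 22, 23, 28, 30, 36]

print(five_number_summary(example_1))  # (1, 35, 49, 82, 95)
print(five_number_summary(example_2))  # (9, 14.0, 22.5, 29.0, 36)
```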

For first names in our class

• The 5-number summary is 3 4 5 6 8

• Lower quartile is 4

• Upper quartile is 6

• Interquartile range is the difference between the quartiles: 6 - 4 = 2

Statistics so far

• Central tendencies:

• Mean = 5.4

• Median = 5

• Mode = 5

• Because the mean and median are about the same, we wouldn’t expect extreme values.

Statistics so far

• Measures of spread:

• Range = 5

• Interquartile range = 2

Statistics so far

• 5 - number summary

• 3 4 5 6 8
