s1.2 calculating means and standard deviations -...

© Boardworks Ltd 2005

AS-Level Maths: Statistics 1 for Edexcel

S1.2 Calculating means and standard deviations

Upload: ngohanh

Post on 29-Jul-2018



Means

Calculating means

Calculating standard deviations

Coding


Mean

The mean is the most widely used average in statistics. It is found by adding up all the values in the data and dividing by how many values there are.

Notation: If the data values are $x_1, x_2, x_3, \ldots, x_n$, then the mean is

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$

Here $\bar{x}$ is the mean symbol, and $\sum x_i$ means the total of all the x values.

Note: The mean takes into account every piece of data, so it is affected by outliers in the data. The median is preferred over the mean if the data contains outliers or is skewed.


Mean

If data are presented in a frequency table:

Value     Frequency
$x_1$     $f_1$
$x_2$     $f_2$
…         …
$x_n$     $f_n$

then the mean is

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_n f_n}{\sum f_i} = \frac{\sum x_i f_i}{\sum f_i}$$


Mean

Example: The table shows the results of a survey into household size. Find the mean size.

To find the mean, we add a third column to the table.

Household size, x    Frequency, f    x × f
1                    20              20
2                    28              56
3                    25              75
4                    19              76
5                    16              80
6                    6               36
TOTAL                114             343

Mean = 343 ÷ 114 = 3.01
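The frequency-table calculation above can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the name `frequency_mean` are illustrative choices, not part of the original slides.

```python
# Household-size survey from the example, stored as {value: frequency}.
household_freq = {1: 20, 2: 28, 3: 25, 4: 19, 5: 16, 6: 6}


def frequency_mean(freq):
    """Mean of data given as {value: frequency}: sum(x * f) / sum(f)."""
    total_f = sum(freq.values())                      # the Frequency column total
    total_xf = sum(x * f for x, f in freq.items())    # the x × f column total
    return total_xf / total_f


mean_size = frequency_mean(household_freq)
print(round(mean_size, 2))  # 3.01
```

The two sums correspond to the TOTAL row of the table (114 and 343).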


Standard deviation



Standard deviation

There are three commonly used measures of spread (or dispersion): the range, the inter-quartile range and the standard deviation.

The standard deviation is widely used in statistics to measure spread. It is based on all the values in the data, so it is sensitive to the presence of outliers in the data.

The variance is related to the standard deviation:

variance = (standard deviation)$^2$

The following formulae can be used to find the variance and s.d.:

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$$\text{s.d.} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$


Standard deviation

Example: The mid-day temperatures (in °C) recorded for one week in June were: 21, 23, 24, 19, 19, 20, 21

First we find the mean:

$$\bar{x} = \frac{21 + 23 + \cdots + 21}{7} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Then we use

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
21        0                   0
23        2                   4
24        3                   9
19        -2                  4
19        -2                  4
20        -1                  1
21        0                   0
                              Total: 22

So variance = 22 ÷ 7 = 3.143

So, s.d. = 1.77°C (3 s.f.)
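As a rough check of the worked example, the deviations column can be rebuilt in Python. The variable names are illustrative; the cross-check uses `statistics.pvariance` and `statistics.pstdev` from the standard library, which divide by n as the formula here does.

```python
import statistics
from math import sqrt

temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C

n = len(temps)
mean = sum(temps) / n                             # 147 / 7 = 21
squared_devs = [(x - mean) ** 2 for x in temps]   # the (x_i - mean)^2 column
variance = sum(squared_devs) / n                  # 22 / 7
sd = sqrt(variance)

print(round(variance, 3), round(sd, 2))           # 3.143 1.77

# Cross-check against the standard library's divide-by-n functions.
assert abs(statistics.pvariance(temps) - variance) < 1e-9
assert abs(statistics.pstdev(temps) - sd) < 1e-9
```

Note that `statistics.variance` and `statistics.stdev` divide by n - 1 instead, so they would not reproduce the figures on this slide. Many calculators offer both versions, so it is worth checking which one is being used.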


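Standard deviation

There is an alternative formula which is usually a more convenient way to find the variance. We start from

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

But, since $\sum x_i = n\bar{x}$,

$$
\begin{aligned}
\sum (x_i - \bar{x})^2 &= \sum \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\
&= \sum x_i^2 - 2\bar{x} \sum x_i + n\bar{x}^2 \\
&= \sum x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 \\
&= \sum x_i^2 - n\bar{x}^2
\end{aligned}
$$

Therefore,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 \quad \text{and} \quad \text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$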


Standard deviation

Example (continued): Looking again at the temperature data for June: 21, 23, 24, 19, 19, 20, 21

We know that

$$\bar{x} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Also, $\sum x_i^2 = 21^2 + 23^2 + \cdots + 21^2 = 3109$

So,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 = \frac{3109}{7} - 21^2 = 3.143$$

$$\text{s.d.} = 1.77\ ^\circ\text{C}$$

Note: Essentially the standard deviation is a measure of how close the values are to the mean value.
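A short sketch comparing the two routes to the variance on the same temperature data. The names `var_alt` and `var_def` are just labels for this illustration.

```python
from math import isclose, sqrt

temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C
n = len(temps)
mean = sum(temps) / n

sum_sq = sum(x * x for x in temps)                    # sum of x_i^2 = 3109
var_alt = sum_sq / n - mean ** 2                      # alternative formula
var_def = sum((x - mean) ** 2 for x in temps) / n     # definition

# Both formulas give the same variance (up to floating-point rounding).
assert isclose(var_alt, var_def)
print(sum_sq, round(var_alt, 3), round(sqrt(var_alt), 2))  # 3109 3.143 1.77
```

The alternative formula only needs the two running totals (the sum of x and the sum of x squared), which is why it is usually quicker by hand.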


Calculating standard deviation from a table

When the data is presented in a frequency table, the formula for finding the standard deviation needs to be adjusted slightly:

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2}$$

Example: A class of 20 students were asked how many times they exercise in a normal week. Find the mean and the standard deviation.

Number of times exercise taken    Frequency
0                                 5
1                                 3
2                                 5
3                                 4
4                                 2
5                                 1


Calculating standard deviation from a table

The table can be extended to help find the mean and the s.d.

No. of times exercise taken, x    Frequency, f    x × f    x² × f
0                                 5               0        0
1                                 3               3        3
2                                 5               10       20
3                                 4               12       36
4                                 2               8        32
5                                 1               5        25
TOTAL:                            20              38       116

$$\bar{x} = \frac{38}{20} = 1.9$$

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2} = \sqrt{\frac{116}{20} - 1.9^2} = 1.48$$
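The extended table translates directly into code. The following is a minimal sketch, assuming the data are held as a `{value: frequency}` dictionary; the function name `frequency_mean_sd` is hypothetical.

```python
from math import sqrt

# Exercise survey from the example: times per week -> number of students.
exercise_freq = {0: 5, 1: 3, 2: 5, 3: 4, 4: 2, 5: 1}


def frequency_mean_sd(freq):
    """Return (mean, s.d.) for data given as {value: frequency}."""
    n = sum(freq.values())                               # total frequency
    sum_fx = sum(f * x for x, f in freq.items())         # x × f column total
    sum_fx2 = sum(f * x * x for x, f in freq.items())    # x² × f column total
    mean = sum_fx / n
    sd = sqrt(sum_fx2 / n - mean ** 2)
    return mean, sd


mean, sd = frequency_mean_sd(exercise_freq)
print(round(mean, 2), round(sd, 2))  # 1.9 1.48
```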


Calculating standard deviation from a table

If data is presented in a grouped frequency table, it is only possible to estimate the mean and the standard deviation. This is because the exact data values are not known.

An estimate is obtained by using the mid-point of an interval to represent each of the values in that interval.

Example: The table shows the annual mileage for the employees of an insurance company. Estimate the mean and standard deviation.

Annual mileage, x         Frequency
0 ≤ x < 5000              6
5000 ≤ x < 10,000         17
10,000 ≤ x < 15,000       14
15,000 ≤ x < 20,000       5
20,000 ≤ x < 30,000       3


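Calculating standard deviation from a table

Mileage            Frequency, f    Mid-point, x    f × x      f × x²
0 – 5000           6               2500            15,000     37,500,000
5000 – 10,000      17              7500            127,500    956,250,000
10,000 – 15,000    14              12,500          175,000    2,187,500,000
15,000 – 20,000    5               17,500          87,500     1,531,250,000
20,000 – 30,000    3               25,000          75,000     1,875,000,000
TOTAL              45                              480,000    6,587,500,000

$$\bar{x} = \frac{480{,}000}{45} = 10{,}667 \text{ miles}$$

$$\text{s.d.} = \sqrt{\frac{6{,}587{,}500{,}000}{45} - 10{,}667^2} = 5711 \text{ miles}$$

Both values are given to the nearest mile. The s.d. of 5711 is obtained by keeping the unrounded mean (10,666.67) in the calculation.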
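The mid-point estimate can be sketched in Python as follows. The list-of-tuples layout and the name `grouped_estimates` are assumptions made for this illustration.

```python
from math import sqrt

# Annual mileage classes from the example: ((lower, upper), frequency).
mileage_classes = [
    ((0, 5000), 6),
    ((5000, 10000), 17),
    ((10000, 15000), 14),
    ((15000, 20000), 5),
    ((20000, 30000), 3),
]


def grouped_estimates(classes):
    """Estimate (mean, s.d.) by treating every value as its class mid-point."""
    n = sum(f for _, f in classes)
    mids = [((lo + hi) / 2, f) for (lo, hi), f in classes]
    sum_fx = sum(f * m for m, f in mids)          # f × x column total
    sum_fx2 = sum(f * m * m for m, f in mids)     # f × x² column total
    mean = sum_fx / n
    sd = sqrt(sum_fx2 / n - mean ** 2)            # uses the unrounded mean
    return mean, sd


est_mean, est_sd = grouped_estimates(mileage_classes)
print(round(est_mean), round(est_sd))  # 10667 5711
```

Because the code keeps the unrounded mean, it gives 5711 for the s.d.; substituting the rounded mean 10,667 by hand gives a value about one mile smaller.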


Notes about standard deviation

Here are some notes to consider about standard deviation.

In many distributions, about two-thirds (roughly 67%) of the data will lie within 1 standard deviation of the mean, whilst nearly all the data values will lie within 2 standard deviations of the mean.

Values that lie more than 2 standard deviations from the mean are sometimes classed as outliers; any such values should be treated carefully.

Standard deviation is measured in the same units as the original data. Variance is measured in the same units squared.

Most calculators have a built-in function which will find the standard deviation for you. Learn how to use this facility on your calculator.


Examination-style question

Examination-style question: The ages of the people in a cinema queue one Monday afternoon are shown in the stem-and-leaf diagram:

2 | 3 6
3 | 1 6 6
4 | 1 2 5 6 9
5 | 0 4 7
6 | 1

Key: 2 | 3 means 23 years old

a) Explain why the diagram suggests that the mean and standard deviation can be sensibly used as measures of location and spread respectively.

b) Calculate the mean and the standard deviation of the ages.

c) The mean and the standard deviation of the ages of the people in the queue on Monday evening were 29 and 6.2 respectively. Compare the ages of the people queuing at the cinema in the afternoon with those in the evening.


Examination-style question

a) The mean and the standard deviation are appropriate, as the distribution of ages is roughly symmetrical and there are no outliers.

b) $\sum x_i = 597$, so

$$\bar{x} = \frac{597}{14} = 42.64286 = 42.6$$

$\sum x_i^2 = 27{,}131$, so

$$\text{s.d.} = \sqrt{\frac{27{,}131}{14} - 42.64286^2} = 10.9$$

c) The cinemagoers in the evening had a smaller mean

age, meaning that they were, on average, younger

than those in the afternoon.

The standard deviation for the ages in the evening was

also smaller, suggesting that the evening audience were

closer together in age.
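The totals used in part (b) can be reproduced by expanding the stem-and-leaf diagram into a list of ages. This is a sketch only; storing the diagram as a `{stem: leaves}` dictionary is an assumption of this example.

```python
from math import sqrt

# Stem-and-leaf diagram from the question: stem (tens) -> leaves (units).
stem_leaf = {2: [3, 6], 3: [1, 6, 6], 4: [1, 2, 5, 6, 9], 5: [0, 4, 7], 6: [1]}

# Expand each stem/leaf pair into an age, e.g. 2 | 3 -> 23.
ages = [10 * stem + leaf for stem, leaves in stem_leaf.items() for leaf in leaves]

n = len(ages)                              # 14 people
sum_x = sum(ages)                          # 597
sum_x2 = sum(a * a for a in ages)          # 27131
mean_age = sum_x / n
sd_age = sqrt(sum_x2 / n - mean_age ** 2)

print(n, sum_x, sum_x2)                        # 14 597 27131
print(round(mean_age, 1), round(sd_age, 1))    # 42.6 10.9
```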


Sometimes in examination questions you are asked to pool

two sets of data together.

Combining sets of data

Example: Six male and five female students sit an

A-level examination.

The mean marks were 52% and 57% for the males

and females respectively. The standard deviations

were 14 and 18 respectively.

Find the combined mean and the standard deviation

for the marks of all 11 students.


Combining sets of data

Let $x_1, \ldots, x_6$ be the marks for the 6 male students.

Let $y_1, \ldots, y_5$ be the marks of the 5 female students.

To find the overall mean, we first need to find the total marks for all 11 students.

As $\bar{x} = 52$, $\sum x = 6 \times 52 = 312$

As $\bar{y} = 57$, $\sum y = 5 \times 57 = 285$

Therefore $\sum x + \sum y = 312 + 285 = 597$

So the combined mean is:

$$\frac{597}{11} = 54.2727\ldots = 54.3\%$$


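Combining sets of data

To find the overall standard deviation, we need to find the total of the marks squared for all 11 students.

Notice that the formula

$$\text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$

rearranges to give

$$\sum x^2 = n\left(\text{s.d.}^2 + \bar{x}^2\right)$$

As $\text{s.d.}_x = 14$, $\sum x^2 = 6(14^2 + 52^2) = 17{,}400$

As $\text{s.d.}_y = 18$, $\sum y^2 = 5(18^2 + 57^2) = 17{,}865$

Therefore, $\sum x^2 + \sum y^2 = 35{,}265$

So the combined s.d. is:

$$\sqrt{\frac{35{,}265}{11} - 54.27^2} = 16.1\% \text{ (to 3 s.f.)}$$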
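The pooling method (recover each group's totals, add them, then apply the usual formulae) can be sketched as a small function. The name `combine_groups` and the `(n, mean, sd)` tuple layout are illustrative assumptions.

```python
from math import sqrt


def combine_groups(groups):
    """Pool summary statistics. groups is a list of (n, mean, sd) tuples."""
    total_n = sum(n for n, _, _ in groups)
    # Sum of values for each group: n * mean.
    total_x = sum(n * mean for n, mean, _ in groups)
    # Sum of squares for each group: n * (sd^2 + mean^2).
    total_x2 = sum(n * (sd ** 2 + mean ** 2) for n, mean, sd in groups)
    mean = total_x / total_n
    sd = sqrt(total_x2 / total_n - mean ** 2)
    return mean, sd


# 6 male students (mean 52, s.d. 14) and 5 female students (mean 57, s.d. 18).
combined_mean, combined_sd = combine_groups([(6, 52, 14), (5, 57, 18)])
print(round(combined_mean, 1), round(combined_sd, 1))  # 54.3 16.1
```

The pooled s.d. is not a weighted average of the two group s.d.s, because it also reflects the gap between the group means.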


Coding


Coding is a technique that can simplify the numerical effort

required in finding a mean or standard deviation.


Coding

Adding

If a number b is added to each piece of data, the mean value is also increased by b.

The standard deviation is unchanged.

Multiplying

If each piece of data is multiplied by a, the mean value is multiplied by a.

The standard deviation is also multiplied by a (strictly by |a|, so the rule as stated assumes a is positive).

More formally, if $y_i = a x_i + b$ then:

$$\bar{y} = a\bar{x} + b$$

$$\text{s.d.}_y = a \times \text{s.d.}_x$$
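The effect of coding can be checked numerically. In this sketch the data set is made up for illustration (it is not from the slides) and `code_data` is a hypothetical helper. The last case uses a negative multiplier to show that the s.d. scales by the size of a, not its sign.

```python
import statistics
from math import isclose


def code_data(data, a, b):
    """Apply the linear coding y = a*x + b to every value (hypothetical helper)."""
    return [a * x + b for x in data]


data = [2, 4, 4, 4, 5, 5, 7, 9]       # made-up sample: mean 5, s.d. 2
mean_x = statistics.mean(data)
sd_x = statistics.pstdev(data)        # population s.d. (divides by n)

for a, b in [(1, 7), (3, 0), (3, 7), (-3, 7)]:
    coded = code_data(data, a, b)
    # The mean follows the coding exactly: mean_y = a * mean_x + b.
    assert isclose(statistics.mean(coded), a * mean_x + b)
    # Adding b has no effect on spread; multiplying scales it by |a|.
    assert isclose(statistics.pstdev(coded), abs(a) * sd_x)
```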


Coding

Example: Find the mean and the standard deviation of the values in the table. Use the transformation below to help you.

$$y = \frac{1}{10}x - 5$$

Using the given transformation, add a y column to the table.

x     Frequency    y
50    3            0
60    5            1
70    7            2
80    4            3
90    1            4


Coding

y        Frequency, f    y × f    y² × f
0        3               0        0
1        5               5        5
2        7               14       28
3        4               12       36
4        1               4        16
Total    20              35       85

To find the mean:

$$\bar{y} = \frac{35}{20} = 1.75$$

To find the s.d.:

$$\text{s.d.}_y = \sqrt{\frac{\sum f_i y_i^2}{\sum f_i} - \bar{y}^2} = \sqrt{\frac{85}{20} - 1.75^2} = 1.09$$


Coding

You have now found the mean and standard deviation of y. To find them for the x values, you must reverse the coding.

We can rearrange:

$$y = \frac{1}{10}x - 5$$

to get:

$$x = 10y + 50$$

Therefore the mean of x is:

$$\bar{x} = 10\bar{y} + 50 = 10 \times 1.75 + 50 = 67.5$$

And the standard deviation of x is: 10 × 1.09 = 10.9

Note how the coding helped to simplify the calculations by making the numbers smaller.
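The coded calculation and its reversal can be compared with a direct calculation on the x values. This is a sketch under the same `{value: frequency}` layout assumption as before; `freq_mean_sd` is a hypothetical helper name.

```python
from math import isclose, sqrt


def freq_mean_sd(freq):
    """Return (mean, s.d.) for data given as {value: frequency}."""
    n = sum(freq.values())
    mean = sum(f * v for v, f in freq.items()) / n
    sd = sqrt(sum(f * v * v for v, f in freq.items()) / n - mean ** 2)
    return mean, sd


x_freq = {50: 3, 60: 5, 70: 7, 80: 4, 90: 1}

# Code the data with y = x/10 - 5, giving y = 0, 1, 2, 3, 4.
y_freq = {x // 10 - 5: f for x, f in x_freq.items()}
mean_y, sd_y = freq_mean_sd(y_freq)

# Reverse the coding: x = 10y + 50, so the s.d. is scaled by 10 only.
mean_x = 10 * mean_y + 50
sd_x = 10 * sd_y

# A direct calculation on the x values should agree.
direct_mean, direct_sd = freq_mean_sd(x_freq)
assert isclose(mean_x, direct_mean) and isclose(sd_x, direct_sd)

print(mean_y, round(sd_y, 2))   # 1.75 1.09
print(mean_x, round(sd_x, 1))   # 67.5 10.9
```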