Measures of Variability

Upload: cynthia-harrington

Post on 12-Jan-2016


Page 1: Measures of Variability Variability. Measure of Variability (Dispersion, Spread) Variance, standard deviation Range Inter-Quartile Range Pseudo-standard

Measures of Variability

[Figure: distribution plot labeled "Variability"; horizontal axis 0 to 25, vertical axis 0 to 0.14]

Page 2

Measure of Variability (Dispersion, Spread)

• Variance, standard deviation

• Range

• Inter-Quartile Range

• Pseudo-standard deviation

Page 3

Range

Page 4

Range

Definition

Let min = the smallest observation

Let max = the largest observation

Then Range = max - min

[Figure: distribution plot with the range marked; horizontal axis 0 to 25, vertical axis 0 to 0.14]

Page 5

Inter-Quartile Range (IQR)

Page 6

Inter-Quartile Range (IQR)

Definition

Let Q1 = the first quartile

Let Q3 = the third quartile

Then the Inter-Quartile Range = IQR = Q3 - Q1

Page 7

[Figure: distribution plot with Q1 and Q3 marked on the horizontal axis (0 to 25); 25% of the area lies below Q1, 50% between Q1 and Q3, and 25% above Q3. The distance from Q1 to Q3 is the Inter-Quartile Range.]

Page 8

Example

The Verbal IQ data on n = 23 students, arranged in increasing order, are:

80 82 84 86 86 89 90 94

94 95 95 96 99 99 102 102

104 105 105 109 111 118 119

Page 9

Example

The Verbal IQ data on n = 23 students, arranged in increasing order, are:

80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119

min = 80, Q1 = 89, Q2 = 96, Q3 = 105, max = 119

Page 10

Range

Range = max – min = 119 – 80 = 39

Inter-Quartile Range

= IQR = Q3 - Q1 = 105 – 89 = 16
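
The range and inter-quartile range for the Verbal IQ data can be checked with a short Python sketch. It assumes the quartile convention used by `statistics.quantiles` with its default `method="exclusive"` (position (n + 1)p in the sorted data), which reproduces the slide's Q1 = 89 and Q3 = 105 for this data set. Other quartile conventions can give slightly different values. The variable names are illustrative only.

```python
import statistics

# Verbal IQ scores for the n = 23 students, in increasing order.
verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

# Range = max - min
data_range = max(verbal_iq) - min(verbal_iq)

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(verbal_iq, n=4)

# Inter-quartile range = Q3 - Q1
iqr = q3 - q1

print("Range =", data_range)
print("Q1, Q2, Q3 =", q1, q2, q3)
print("IQR =", iqr)
```

The results should agree with the hand calculation above (Range = 39, IQR = 16).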

Page 11

Some Comments

• The range and inter-quartile range are relatively easy to compute.

• The range is slightly easier to compute than the inter-quartile range.

• The range is very sensitive to outliers (extreme observations); the inter-quartile range is much less affected by them.

Page 12

Variance and Standard deviation

Page 13

Sample Variance

Let x1, x2, x3, …, xn denote a set of n numbers.

Recall the mean of the n numbers is defined as:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_{n-1} + x_n}{n}$$

Page 14

The numbers

$$d_1 = x_1 - \bar{x},\quad d_2 = x_2 - \bar{x},\quad d_3 = x_3 - \bar{x},\quad \ldots,\quad d_n = x_n - \bar{x}$$

are called deviations from the mean.

Page 15

The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

is called the sum of squares of deviations from the mean.

Writing it out in full:

$$d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2$$

or

$$(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2$$

Page 16

The Sample Variance

is defined as the quantity:

$$\frac{\sum_{i=1}^{n} d_i^2}{n-1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

and is denoted by the symbol $s^2$.

Page 17

Example

Let x1, x2, x3, x4, x5 denote the set of 5 numbers in the following table.

i 1 2 3 4 5

xi 10 15 21 7 13

Page 18

Then

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$$

and

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_{n-1} + x_n}{n} = \frac{66}{5} = 13.2$$

Page 19

The deviations from the mean d1, d2, d3, d4, d5 are given in the following table.

i 1 2 3 4 5

xi 10 15 21 7 13

di -3.2 1.8 7.8 -6.2 -0.2

Page 20

The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = (-3.2)^2 + (1.8)^2 + (7.8)^2 + (-6.2)^2 + (-0.2)^2$$

$$= 10.24 + 3.24 + 60.84 + 38.44 + 0.04 = 112.80$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{112.8}{4} = 28.2$$

Page 21

The Sample Standard Deviation s

Definition: The Sample Standard Deviation is defined by:

$$s = \sqrt{\frac{\sum_{i=1}^{n} d_i^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Hence the Sample Standard Deviation, s, is the square root of the sample variance.

Page 22

In the last example

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{112.8}{4}} = \sqrt{28.2} = 5.31$$
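
The worked example on the five numbers 10, 15, 21, 7, 13 can be retraced step by step in Python. This is a minimal sketch that follows the definitions above (deviations, sum of squares, division by n - 1, square root). The variable names are illustrative, and the standard-library `statistics.variance` is used only as a cross-check.

```python
import math
import statistics

x = [10, 15, 21, 7, 13]
n = len(x)

mean = sum(x) / n                      # sample mean, 66 / 5
deviations = [xi - mean for xi in x]   # d_i = x_i - mean
ss = sum(d ** 2 for d in deviations)   # sum of squares of deviations
variance = ss / (n - 1)                # sample variance s^2
sd = math.sqrt(variance)               # sample standard deviation s

print("mean =", mean)
print("sum of squares =", round(ss, 2))
print("s^2 =", round(variance, 2))
print("s =", round(sd, 2))

# Cross-check against the standard library (also divides by n - 1).
print(math.isclose(variance, statistics.variance(x)))
```

Because the deviations are decimals, the intermediate values carry small floating-point rounding errors, which is why the printed results are rounded.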

Page 23

Interpretations of s

• In Normal distributions

– Approximately 2/3 of the observations will lie within one standard deviation of the mean

– Approximately 95% of the observations lie within two standard deviations of the mean

– In a histogram of the Normal distribution, the standard deviation is approximately the distance from the mode to the inflection point

Page 24

[Figure: Normal curve showing s as the distance from the mode to the inflection point; horizontal axis 0 to 25, vertical axis 0 to 0.14]

Page 25

[Figure: Normal curve with a distance s marked on either side of the center, enclosing about 2/3 of the area]

Page 26

[Figure: Normal curve with the distance 2s marked]

Page 27

Example

A researcher collected data on 1500 males aged 60-65.

The variables measured were cholesterol and blood pressure.

– The mean blood pressure was 155 with a standard deviation of 12.

– The mean cholesterol level was 230 with a standard deviation of 15.

– In both cases the data were normally distributed.

Page 28

Interpretation of these numbers

• Blood pressure levels vary about the value 155 in males aged 60-65.

• Cholesterol levels vary about the value 230 in males aged 60-65.

Page 29

• About 2/3 of males aged 60-65 have blood pressure within 12 of 155, i.e. between 155 - 12 = 143 and 155 + 12 = 167.

• About 2/3 of males aged 60-65 have cholesterol within 15 of 230, i.e. between 230 - 15 = 215 and 230 + 15 = 245.

Page 30

• About 95% of males aged 60-65 have blood pressure within 2(12) = 24 of 155, i.e. between 155 - 24 = 131 and 155 + 24 = 179.

• About 95% of males aged 60-65 have cholesterol within 2(15) = 30 of 230, i.e. between 230 - 30 = 200 and 230 + 30 = 260.
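
The one- and two-standard-deviation intervals for the blood pressure and cholesterol example can be reproduced with a short Python sketch. The helper `interval` and the dictionary `summaries` are illustrative names, not part of the original slides. The last lines use the standard library's `NormalDist` to show the exact Normal coverage that the rules of thumb "2/3" and "95%" approximate.

```python
from statistics import NormalDist


def interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


# (mean, standard deviation) for each variable, as given in the example.
summaries = {"blood pressure": (155, 12), "cholesterol": (230, 15)}

for name, (mean, sd) in summaries.items():
    print(name, "within 1 sd:", interval(mean, sd, 1))
    print(name, "within 2 sd:", interval(mean, sd, 2))

# Exact coverage for a Normal distribution within k standard deviations.
z = NormalDist()  # standard Normal: mean 0, standard deviation 1
coverage_1sd = z.cdf(1) - z.cdf(-1)
coverage_2sd = z.cdf(2) - z.cdf(-2)
print(round(coverage_1sd, 4), round(coverage_2sd, 4))
```

The exact coverages are a little above 2/3 and a little above 95%, so the slide's figures are convenient roundings rather than exact proportions.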

Page 31

Measures of Variability

[Figure: distribution plot labeled "Variability"; horizontal axis 0 to 25, vertical axis 0 to 0.14]

Page 32

Measure of Variability (Dispersion, Spread)

• Variance, standard deviation

• Range

• Inter-Quartile Range

• Pseudo-standard deviation

Page 33

Range

Range = max – min

Interquartile range (IQR)

IQR = Q3 – Q1

Page 34

The Sample Variance

$$s^2 = \frac{\sum_{i=1}^{n} d_i^2}{n-1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Page 35

The Sample standard deviation

$$s = \sqrt{\frac{\sum_{i=1}^{n} d_i^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{s^2}$$

Page 36

A Computing formula for:

Sum of squares of deviations from the mean:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2$$

The difficulty with this formula is that $\bar{x}$ will often have many decimals.

As a result, each term in the above sum will also have many decimals.

Page 37

The sum of squares of deviations from the mean can also be computed using the following identity:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
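
A short derivation shows why the identity holds. It expands the square and uses only the definition of the mean, which gives $\sum_{i=1}^{n} x_i = n\bar{x}$:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```

The last step substitutes $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, so that $n\bar{x}^2 = \left(\sum_{i=1}^{n} x_i\right)^2 / n$.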

Page 38

To use this identity we need to compute:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$

and

$$\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2$$

Page 39

Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$$

Page 40

and

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}}$$

Page 41

Example

The Verbal IQ data on n = 23 students, arranged in increasing order, are:

80 82 84 86 86 89 90 94

94 95 95 96 99 99 102 102

104 105 105 109 111 118 119

Page 42

$$\sum_{i=1}^{n} x_i = 80 + 82 + 84 + 86 + 86 + 89 + 90 + 94 + 94 + 95 + 95 + 96 + 99 + 99 + 102 + 102 + 104 + 105 + 105 + 109 + 111 + 118 + 119 = 2244$$

$$\sum_{i=1}^{n} x_i^2 = 80^2 + 82^2 + 84^2 + 86^2 + 86^2 + 89^2 + 90^2 + 94^2 + 94^2 + 95^2 + 95^2 + 96^2 + 99^2 + 99^2 + 102^2 + 102^2 + 104^2 + 105^2 + 105^2 + 109^2 + 111^2 + 118^2 + 119^2 = 221494$$

Page 43

Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = 221494 - \frac{(2244)^2}{23} = 2557.652$$

Page 44

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{221494 - \frac{(2244)^2}{23}}{22} = \frac{2557.652}{22} = 116.26$$

Page 45

Also

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}} = \sqrt{\frac{221494 - \frac{(2244)^2}{23}}{22}} = \sqrt{\frac{2557.652}{22}} = \sqrt{116.26} = 10.782$$
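
The computing formula can be applied to the Verbal IQ data in a few lines of Python. This is a sketch of the same arithmetic as the slides, with illustrative variable names. `statistics.stdev` appears only as an independent cross-check.

```python
import math
import statistics

# Verbal IQ scores for the n = 23 students.
verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]
n = len(verbal_iq)

sum_x = sum(verbal_iq)                     # sum of the observations
sum_x2 = sum(v * v for v in verbal_iq)     # sum of the squared observations

# Computing formula: sum of squares of deviations from the mean.
ss = sum_x2 - sum_x ** 2 / n
variance = ss / (n - 1)                    # sample variance s^2
sd = math.sqrt(variance)                   # sample standard deviation s

print(sum_x, sum_x2)
print(round(ss, 3), round(variance, 2), round(sd, 3))

# Cross-check against the standard library.
print(math.isclose(sd, statistics.stdev(verbal_iq)))
```

The computing formula is convenient by hand because the two sums involve only whole numbers here. In floating-point arithmetic it can lose accuracy when the values are large relative to their spread, so software usually works from the deviations instead.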

Page 46

A quick (rough) calculation of s

$$s \approx \frac{\text{Range}}{4}$$

The reason for this is that approximately all (95%) of the observations are between

$$\bar{x} - 2s \quad \text{and} \quad \bar{x} + 2s$$

Thus

$$\max \approx \bar{x} + 2s \quad \text{and} \quad \min \approx \bar{x} - 2s$$

and

$$\text{Range} = \max - \min \approx (\bar{x} + 2s) - (\bar{x} - 2s) = 4s$$

Hence

$$s \approx \frac{\text{Range}}{4}$$

Page 47

Example

Verbal IQ on n = 23 students: min = 80 and max = 119

$$s \approx \frac{119 - 80}{4} = \frac{39}{4} = 9.75$$

This compares with the exact value of s, which is 10.782. The rough method is useful for checking your calculation of s.
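
As a quick illustration of using the rough method as a sanity check, the sketch below compares Range/4 with the exact sample standard deviation for the same data. The names `rough_s` and `exact_s` are illustrative.

```python
import statistics

# Verbal IQ scores for the n = 23 students.
verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

rough_s = (max(verbal_iq) - min(verbal_iq)) / 4   # Range / 4
exact_s = statistics.stdev(verbal_iq)             # sample standard deviation

print("rough s =", rough_s)
print("exact s =", round(exact_s, 3))
```

The two values need not match closely. The check is mainly useful for catching gross errors, such as a calculated s that is several times larger or smaller than Range/4.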

Page 48

The Pseudo Standard Deviation (PSD)

Page 49

The Pseudo Standard Deviation (PSD)

Definition: The Pseudo Standard Deviation (PSD) is defined by:

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{\text{Inter-Quartile Range}}{1.35}$$

Page 50

Properties

• For Normal distributions the magnitude of the pseudo standard deviation (PSD) and the standard deviation (s) will be approximately the same value

• For leptokurtic distributions the standard deviation (s) will be larger than the pseudo standard deviation (PSD)

• For platykurtic distributions the standard deviation (s) will be smaller than the pseudo standard deviation (PSD)

Page 51

Example

Verbal IQ on n = 23 students

Inter-Quartile Range = IQR = Q3 - Q1 = 105 – 89 = 16

Pseudo standard deviation

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{16}{1.35} = 11.85$$

This compares with the standard deviation

$$s = 10.782$$
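
The divisor 1.35 is not explained on the slides. A likely reading is that it comes from the Normal distribution, where the quartiles sit about 0.6745 standard deviations either side of the mean, so the IQR is about 1.35 standard deviations. The sketch below checks that constant with the standard library's `NormalDist` and then recomputes the PSD for the example. The variable names are illustrative.

```python
from statistics import NormalDist

# For a standard Normal distribution, Q3 is the 75th percentile and Q1 is
# its mirror image, so IQR / sigma = 2 * (75th percentile of the standard Normal).
iqr_in_sd_units = 2 * NormalDist().inv_cdf(0.75)

# Quartiles of the Verbal IQ data, taken from the example above.
q1, q3 = 89, 105
iqr = q3 - q1
psd = iqr / 1.35          # pseudo standard deviation, PSD = IQR / 1.35

print(round(iqr_in_sd_units, 3))
print(round(psd, 2))
```

This is why the PSD and s are expected to be close for Normal data, and why a noticeable gap between them hints at heavier or lighter tails than the Normal.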

Page 52

Summary