chapter-2 statistical description of quantitative variable


Upload: colleen-grant

Post on 28-Dec-2015


Page 1: Chapter-2 Statistical description of quantitative variable

Chapter 2 Statistical description of quantitative variables

Page 2: Chapter-2 Statistical description of quantitative variable

Teaching contents

In this chapter, we study techniques for describing quantitative variables.

Section 1 Frequency distribution table and

frequency distribution graph

Section 2 Measures of central tendency

Section 3 Measures of dispersion tendency

Page 3: Chapter-2 Statistical description of quantitative variable

Teaching aims

To learn how to use frequency tables and graphs.

To master the application of the different descriptive indexes.

Page 4: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

Section 1 Frequency distribution

table and frequency distribution

graph

Page 5: Chapter-2 Statistical description of quantitative variable

part 1 Frequency distribution table and

graph of qualitative variable

part 2 Frequency distribution table and

graph of quantitative variable

part 3 Usage of frequency distribution

graph


Page 6: Chapter-2 Statistical description of quantitative variable

[Example 1.1] University officials periodically review the distribution of undergraduate majors to help determine a fair allocation of resources. The following data were obtained.

Table 1.1 The distribution of undergraduate majors

College              Number of majors
Agriculture          1500
Arts and sciences    3200
Education            1200
Engineering          4100

Page 7: Chapter-2 Statistical description of quantitative variable

Fig 1.1 The distribution of undergraduate majors (bar chart; y-axis: number of majors, 0 to 5000; bars: engineering, arts and sciences, agriculture, education)

Page 8: Chapter-2 Statistical description of quantitative variable

[Example 1.2] The techniques will be illustrated using the Scottish Heart Health Study. For simplicity, we take only one variable, recorded on 50 subjects.

Page 9: Chapter-2 Statistical description of quantitative variable


5.75 6.29 6.13 6.78 6.46

6.76 5.98 6.25 6.31 5.99

6.47 5.71 5.19 4.35 5.35

7.11 6.89 6.05 7.01 5.86

5.42 4.92 7.12 5.85 5.64

7.04 6.23 5.71 6.74 6.36

5.75 7.71 6.19 7.55 6.76

7.14 5.73 6.73 7.86 5.51

6.02 6.54 5.34 6.92 7.15

6.55 7.16 4.79 6.64 6.83

Table 1.2 Serum total cholesterol (mmol/L) of 50 subjects from the Scottish Heart Health Study

Page 10: Chapter-2 Statistical description of quantitative variable

How to describe the data in Table 1.2?

List all the data one by one. This makes it difficult for the reader to grasp the distribution characteristics of the 50 individuals.

Summarize the data using specific indexes, which is economical in space and easier for the reader to understand.

Page 11: Chapter-2 Statistical description of quantitative variable

Frequency distribution table and frequency distribution graph

Step 1 Find MIN and MAX, and compute the range.

Step 2 Set up the class intervals.

Step 3 Assign each observation to one of the class intervals.

Page 12: Chapter-2 Statistical description of quantitative variable

Step 1

MIN 4.35

MAX 7.86

RANGE 3.51

The range is the difference between MAX and MIN: 7.86 - 4.35 = 3.51.

Page 13: Chapter-2 Statistical description of quantitative variable

Step 2

Divide the range by the approximate number of class intervals.

Generally we wish to have 7 to 15 class intervals, depending on the sample size: the larger the sample, the more class intervals.

Page 14: Chapter-2 Statistical description of quantitative variable

Suppose we wish to have 7 class intervals. Then the interval width is

3.51 (range) / 7 ≈ 0.5

so we choose 0.5 as the interval width.
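Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration using the values of Table 1.2; the variable names and the choice to round the width to one decimal place are assumptions, not part of the original slides.

```python
# Serum total cholesterol (mmol/L), copied from Table 1.2.
cholesterol = [
    5.75, 6.29, 6.13, 6.78, 6.46,
    6.76, 5.98, 6.25, 6.31, 5.99,
    6.47, 5.71, 5.19, 4.35, 5.35,
    7.11, 6.89, 6.05, 7.01, 5.86,
    5.42, 4.92, 7.12, 5.85, 5.64,
    7.04, 6.23, 5.71, 6.74, 6.36,
    5.75, 7.71, 6.19, 7.55, 6.76,
    7.14, 5.73, 6.73, 7.86, 5.51,
    6.02, 6.54, 5.34, 6.92, 7.15,
    6.55, 7.16, 4.79, 6.64, 6.83,
]

# Step 1: MIN, MAX and range.
minimum = min(cholesterol)
maximum = max(cholesterol)
data_range = maximum - minimum

# Step 2: divide the range by the desired number of class intervals
# and round to a convenient width (one decimal place is an assumption).
target_classes = 7
raw_width = data_range / target_classes
width = round(raw_width, 1)

print(f"MIN={minimum}, MAX={maximum}, RANGE={data_range:.2f}, width={width}")
```

Rounding the width to a convenient value keeps the class limits easy to read, at the cost of possibly needing one interval more than planned to cover MAX.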

Page 15: Chapter-2 Statistical description of quantitative variable

Divide the range by the desired number of subintervals.

Note: the first subinterval must contain MIN, and the last one must contain MAX.

Page 16: Chapter-2 Statistical description of quantitative variable

Step 3

Construct the frequency distribution by keeping a tally of the number of measurements falling in each interval.

Page 17: Chapter-2 Statistical description of quantitative variable

Note: each class interval includes its lower limit (L) but not its upper limit (U).

For example, an observation of 5.5 belongs to the fourth group (5.5-6.0), not the third.

Class intervals for cholesterol (mmol/L):

4.0-4.5
4.5-5.0
5.0-5.5
5.5-6.0
6.0-6.5
6.5-7.0
7.0-7.5
7.5-8.0

Page 18: Chapter-2 Statistical description of quantitative variable

Table 1.3 Frequency distribution table for serum total cholesterol

Cholesterol (mmol/L)   Tally          Frequency   Percentage   Cumulative percentage
4.0-4.5                |              1           2%           2%
4.5-5.0                ||             2           4%           6%
5.0-5.5                ||||           4           8%           14%
5.5-6.0                |||||||||||    11          22%          36%
6.0-6.5                |||||||||||    11          22%          58%
6.5-7.0                |||||||||||    11          22%          80%
7.0-7.5                |||||||        7           14%          94%
7.5-8.0                |||            3           6%           100%
Total                                 50          100%

In each interval, the first value is the lower limit and the second is the upper limit.

Percentage is the frequency divided by the sample size (50).
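Step 3 can be sketched as follows. This is a minimal illustration, assuming the class limits 4.0, 4.5, ..., 8.0 and the rule that each interval includes its lower limit but not its upper limit; the use of `bisect_right` is just one convenient way to apply that rule.

```python
from bisect import bisect_right

# Serum total cholesterol (mmol/L), copied from Table 1.2.
cholesterol = [
    5.75, 6.29, 6.13, 6.78, 6.46,
    6.76, 5.98, 6.25, 6.31, 5.99,
    6.47, 5.71, 5.19, 4.35, 5.35,
    7.11, 6.89, 6.05, 7.01, 5.86,
    5.42, 4.92, 7.12, 5.85, 5.64,
    7.04, 6.23, 5.71, 6.74, 6.36,
    5.75, 7.71, 6.19, 7.55, 6.76,
    7.14, 5.73, 6.73, 7.86, 5.51,
    6.02, 6.54, 5.34, 6.92, 7.15,
    6.55, 7.16, 4.79, 6.64, 6.83,
]

# Class limits 4.0, 4.5, ..., 8.0 (width 0.5, eight intervals).
edges = [4.0 + 0.5 * k for k in range(9)]
counts = [0] * (len(edges) - 1)

for x in cholesterol:
    # Index k such that edges[k] <= x < edges[k + 1]:
    # the lower limit is included, the upper limit is not.
    # (A value equal to the last limit, 8.0, would need special handling;
    # none occurs in these data.)
    k = bisect_right(edges, x) - 1
    counts[k] += 1

n = len(cholesterol)
cumulative = 0
for k, f in enumerate(counts):
    cumulative += f
    print(f"{edges[k]:.1f}-{edges[k + 1]:.1f}  {f:2d}  "
          f"{100 * f / n:4.0f}%  {100 * cumulative / n:4.0f}%")
```

The printed rows should reproduce the frequency, percentage, and cumulative percentage columns of Table 1.3.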

Page 19: Chapter-2 Statistical description of quantitative variable

Fig 1.2 Frequency distribution graph for serum total cholesterol (x-axis: serum cholesterol, class midpoints 4.25 to 7.75 mmol/L; y-axis: frequency, 0 to 12; bar frequencies 1, 2, 4, 11, 11, 11, 7, 3; Mean = 6.29, Std. Dev = 0.76, N = 50)

Page 20: Chapter-2 Statistical description of quantitative variable

The difference between the two graphs: Fig 1.2 (frequency distribution graph for serum total cholesterol, a quantitative variable) and Fig 1.1 (the distribution of undergraduate majors, a qualitative variable).

Page 21: Chapter-2 Statistical description of quantitative variable

Usage of frequency distribution graph

1 To describe the characteristics of the frequency distribution.

From Table 1.3 and Fig 1.2, we can see that the serum total cholesterol of most people (37 of 50 subjects, 74%) is between 5.0 and 7.0 mmol/L; the proportion outside this range is small.

Page 22: Chapter-2 Statistical description of quantitative variable

How to describe the distribution characteristics of data?

Central tendency

Dispersion tendency

Page 23: Chapter-2 Statistical description of quantitative variable

Describe how data are distributed

Positive-skewed, negative-skewed, symmetric

Page 24: Chapter-2 Statistical description of quantitative variable

Table 2 Mercury concentration of hair in 238 healthy people

Mercury concentration (μg/g)   Number
<0.3                           3
0.3~                           17
0.7~                           66
1.1~                           60
1.5~                           48
1.9~                           18
2.3~                           16
2.7~                           6
3.1~                           1
3.5~                           1
3.9~                           2
Total                          238

Figure: frequency distribution graph of hair mercury concentration (μg/g, x-axis) against number of people (y-axis, 0 to 70). The distribution is positive-skewed.

Page 25: Chapter-2 Statistical description of quantitative variable

Table 3 Myoglobin concentration in blood serum of 101 normal people

Myoglobin concentration (μg/ml)   Number
0~                                2
5~                                3
10~                               7
15~                               9
20~                               10
25~                               22
30~                               23
35~                               14
40~                               9
45~50                             2
Total                             101

Figure: frequency distribution graph of serum myoglobin concentration (μg/ml, x-axis) against number of people (y-axis, 0 to 25). The distribution is negative-skewed.

Page 26: Chapter-2 Statistical description of quantitative variable

2 From the frequency distribution, we can find outliers (values that are too large or too small) very easily.

For instance, if all the serum total cholesterol values lie between 4.0 and 8.0 and one value is 28 (implausibly large), we call it an outlier and should check whether it was recorded correctly.

3 It is a way of describing data.

Page 27: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

Section 2 Measures of

central tendency

Page 28: Chapter-2 Statistical description of quantitative variable

Measures of central tendency:

1 Arithmetic mean
2 Geometric mean
3 Median and percentile
4 Mode

Central tendency reflects the average level of a series of measurements.

Page 29: Chapter-2 Statistical description of quantitative variable

The arithmetic mean

[Definition] The arithmetic mean, also called the mean, is defined as the sum of the measurements divided by the total number of measurements.

Page 30: Chapter-2 Statistical description of quantitative variable

[Symbols] The population mean is denoted by the Greek letter μ (read "mu") and the sample mean is denoted by the symbol $\bar{X}$ (read "X-bar").

[Sample mean]

$\bar{X} = \frac{\sum X}{n}$

[Population mean]

$\mu = \frac{\sum X}{N}$

n is the total number of observations in the sample (N in the population).

X is a particular value.

$\sum$ (read "sigma") indicates the operation of adding.

Page 31: Chapter-2 Statistical description of quantitative variable

[Example 2.1] The mean score on a given test can be found for an entire class. Take a look at this American History class:

Page 32: Chapter-2 Statistical description of quantitative variable

[Solution] We find the mean score by adding all the scores together and dividing by 10 (the number of scores).

$\bar{X} = \frac{\sum X}{n} = \frac{90 + 75 + \cdots + 85}{10} = 82.4$

Page 33: Chapter-2 Statistical description of quantitative variable

[Properties of the arithmetic mean]

All the values are included in computing the mean.

The mean is easily affected by extremely large or small values.

The sum of the deviations of the values from their mean is zero: $\sum (X - \bar{X}) = 0$
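The property that the deviations from the mean sum to zero follows directly from the definition of the sample mean. A short derivation, using only the symbols already introduced (the index i is added here for clarity):

```latex
\sum_{i=1}^{n} (X_i - \bar{X})
  = \sum_{i=1}^{n} X_i - n\bar{X}
  = \sum_{i=1}^{n} X_i - n \cdot \frac{\sum_{i=1}^{n} X_i}{n}
  = 0
```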

Page 34: Chapter-2 Statistical description of quantitative variable

[Notice]

The mean should only be used for homogeneous data.

For example, we can compute the mean height of ten-year-old boys, but it is not meaningful to calculate the mean height of boys aged 1 to 14 years.

The mean is an appropriate summary only when the distribution is approximately symmetric, as in the normal distribution.

Page 35: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

mean

Mean can be

used.

Page 36: Chapter-2 Statistical description of quantitative variable

Geometric mean

[Definition]

The geometric mean is defined as the nth root of the product of the n numbers.

[Symbol] G

Page 37: Chapter-2 Statistical description of quantitative variable

[Formula]

$G = \sqrt[n]{X_1 X_2 \cdots X_n}$

or

$G = \lg^{-1}\left(\frac{\lg X_1 + \lg X_2 + \cdots + \lg X_n}{n}\right) = \lg^{-1}\left(\frac{\sum \lg X}{n}\right)$

Page 38: Chapter-2 Statistical description of quantitative variable

[Example 2.3] The serum antibody levels of six patients are listed below.

1:10, 1:20, 1:40, 1:80, 1:80, 1:160

Calculate the geometric mean.

Page 39: Chapter-2 Statistical description of quantitative variable

[Solution]

$G = \lg^{-1}\left(\frac{\sum \lg X}{n}\right) = \lg^{-1}\left(\frac{\lg 10 + \lg 20 + \cdots + \lg 160}{6}\right) = \lg^{-1}(1.6522) \approx 45$

So the geometric mean is 1:45.

Here X is the reciprocal of the antibody level, lg X is the logarithm of the reciprocal, n = 6 is the sample size, and $\lg^{-1}$ denotes the inverse logarithm.
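The calculation for Example 2.3 can be checked numerically. The sketch below computes the geometric mean both ways, from the mean of the base-10 logarithms and from the nth root of the product; the variable names are illustrative only.

```python
import math

# Reciprocals of the six antibody levels in Example 2.3 (1:10 -> 10, etc.).
titres = [10, 20, 40, 80, 80, 160]
n = len(titres)

# Logarithmic form: inverse logarithm of the mean of lg X.
mean_log = sum(math.log10(x) for x in titres) / n
g_from_logs = 10 ** mean_log

# Root form: nth root of the product of the n values.
g_from_root = math.prod(titres) ** (1 / n)

print(f"mean of lg X = {mean_log:.4f}")
print(f"G (log form)  = {g_from_logs:.1f}")
print(f"G (root form) = {g_from_root:.1f}")
```

Both forms give the same value up to floating-point rounding; the logarithmic form is usually preferred for long series because the product can become very large.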

Page 40: Chapter-2 Statistical description of quantitative variable

[Usage of G]

The geometric mean is often used for data that follow a geometric progression, such as 1:2, 1:4, 1:8, 1:16, 1:32.

Page 41: Chapter-2 Statistical description of quantitative variable

Median

[Definition]

The median, also called the 50th percentile, is the midpoint of the observations when they are arranged in ascending order.

Page 42: Chapter-2 Statistical description of quantitative variable

[Formula]

When n is odd, the median is the middle value when the data are arranged in ascending order:

$M = X_{\frac{n+1}{2}}$

When n is even, the median is the mean of the middle two values when the data are arranged in ascending order:

$M = \left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right) / 2$

Page 43: Chapter-2 Statistical description of quantitative variable

[Example 2.5]

Each of 7 children in the second grade was given a reading aptitude test. The scores were as shown below.

95 86 64 81 75 76 69

Determine the median test score.

Page 44: Chapter-2 Statistical description of quantitative variable

[Solution]

First, we arrange the scores in ascending order:

64 69 75 76 81 86 95

There are 7 measurements and the fourth is the middle value, so the median is 76. Alternatively, we can use the formula

$M = X_{\frac{n+1}{2}} = X_4 = 76$

Page 45: Chapter-2 Statistical description of quantitative variable

[Example 2.6]

An experiment was conducted to measure the effectiveness of a new procedure for pruning grapes. Ten workers were each assigned the task of pruning an acre of grapes. The productivity, measured in worker-hours/acre, was recorded for each person.

4.4 4.9 3.8 5.2 4.7 4.6 5.4 3.8 4.0 4.3

Determine the median productivity for the group.

Page 46: Chapter-2 Statistical description of quantitative variable

[Solution]

Arrange the data in ascending order:

3.8 3.8 4.0 4.3 4.4 4.6 4.7 4.9 5.2 5.4

Compute the mean of the 5th and 6th values:

$M = \left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)/2 = (X_5 + X_6)/2 = (4.4 + 4.6)/2 = 4.5$
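The odd and even cases of the median formula can be written as one small function. This is a sketch for illustration; the function name `median` and the variable names are assumptions, and the data are those of Examples 2.5 and 2.6.

```python
def median(values):
    """Median of ungrouped data, following the odd/even formulas."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the middle value X_{(n+1)/2} (0-based index n // 2).
        return ordered[mid]
    # n even: mean of X_{n/2} and X_{n/2 + 1} (0-based mid - 1 and mid).
    return (ordered[mid - 1] + ordered[mid]) / 2


reading_scores = [95, 86, 64, 81, 75, 76, 69]                        # Example 2.5
pruning_hours = [4.4, 4.9, 3.8, 5.2, 4.7, 4.6, 5.4, 3.8, 4.0, 4.3]   # Example 2.6

print(median(reading_scores))
print(round(median(pruning_hours), 2))
```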

Page 47: Chapter-2 Statistical description of quantitative variable

[Exercise]

Exercise capacity (in seconds) was determined for each of 11 patients being treated for chronic heart failure.

906 684 897 1320 1200 882 711 837 1008 1170 1056

Determine the median and mean.

Answer

Mean 970

Median 906
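The stated answers can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

# Exercise capacity (seconds) of the 11 patients in the exercise.
exercise_capacity = [906, 684, 897, 1320, 1200, 882,
                     711, 837, 1008, 1170, 1056]

mean_capacity = statistics.mean(exercise_capacity)
median_capacity = statistics.median(exercise_capacity)

print(f"mean   = {mean_capacity:.1f}")
print(f"median = {median_capacity}")
```

The mean is larger than the median here because the largest values (1200 and 1320) pull the mean upward, while the median depends only on the middle observation.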

Page 48: Chapter-2 Statistical description of quantitative variable

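When the sample size is very large, or when the data are grouped, we can use another formula to compute the median (P50).

The percentile Px divides the ordered data between Min (P0) and Max (P100) so that x% of the observations lie below it and (100 - x)% lie above it. The median M is P50.

$P_x = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right)$

$P_{50} = L + \frac{i}{f_m}\left(n \cdot 50\% - \sum f_L\right)$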

Page 49: Chapter-2 Statistical description of quantitative variable

$P_{50} = L + \frac{i}{f_m}\left(n \cdot 50\% - \sum f_L\right)$

$f_m$: frequency of the group containing the median ($f_x$ for a general percentile).

i: interval width.

L: lower limit of the group containing the median.

$\sum f_L$: cumulative frequency of the groups below the group containing the median.

n: sample size.

Page 50: Chapter-2 Statistical description of quantitative variable

[Example 2.7]

Determine the median in Example 1.2.

Page 51: Chapter-2 Statistical description of quantitative variable

Cholesterol (mmol/L)   Frequency   Percentage   Cumulative frequency   Cumulative percentage
4.0-4.5                1           2%           1                      2%
4.5-5.0                2           4%           3                      6%
5.0-5.5                4           8%           7                      14%
5.5-6.0                11          22%          18                     36%
6.0-6.5                11          22%          29                     58%
6.5-7.0                11          22%          40                     80%
7.0-7.5                7           14%          47                     94%
7.5-8.0                3           6%           50                     100%
Total                  50          100%

In each interval, the first value is the lower limit and the second is the upper limit.

Page 52: Chapter-2 Statistical description of quantitative variable

To determine which interval the median belongs to, we must find the first interval for which the cumulative percentage reaches 50%. This interval is the one containing the median.

Page 53: Chapter-2 Statistical description of quantitative variable

For these data, the interval from 6.0 to 6.5 is the first interval for which the cumulative percentage reaches 50% (58%, against 36% for the interval before it), as shown in the cumulative percentage column of the table. So this interval contains the median. Then

L = 6.0, $f_m$ = 11, n = 50, i = 0.5, $\sum f_L$ = 18

$P_{50} = L + \frac{i}{f_m}\left(n \cdot 50\% - \sum f_L\right) = 6.0 + \frac{0.5}{11}(50 \times 50\% - 18) = 6.32$
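The grouped-data percentile formula can be sketched as a small function. The function name `grouped_percentile` and its parameters are assumptions made for this illustration; the frequencies are those of the cholesterol frequency table, and equal interval widths are assumed.

```python
def grouped_percentile(lower_limits, frequencies, width, percent):
    """P_x = L + (i / f_x) * (n * x% - sum of f below the interval)."""
    n = sum(frequencies)
    target = n * percent / 100        # n * x%
    cumulative = 0                    # cumulative frequency below the interval
    for lower, f in zip(lower_limits, frequencies):
        # First non-empty interval whose cumulative frequency reaches the target.
        if f > 0 and cumulative + f >= target:
            return lower + width / f * (target - cumulative)
        cumulative += f
    raise ValueError("percent must be between 0 and 100")


lower_limits = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
frequencies = [1, 2, 4, 11, 11, 11, 7, 3]

p50 = grouped_percentile(lower_limits, frequencies, 0.5, 50)
print(f"P50 = {p50:.2f}")
```

The same function can be called with `percent=25` or `percent=75` to obtain the quartiles from the grouped table.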

Page 54: Chapter-2 Statistical description of quantitative variable

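[Exercise]

Calculate P25 and P75 in Example 1.2.

$P_{25} = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right) = 5.5 + \frac{0.5}{11}(50 \times 25\% - 7) = 5.75$

$P_{75} = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right) = 6.5 + \frac{0.5}{11}(50 \times 75\% - 29) = 6.89$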

Page 55: Chapter-2 Statistical description of quantitative variable

[Properties of the median]

It is not affected by extreme values.

It is the best index when there is no exact value at one or both ends of the distribution.

Page 56: Chapter-2 Statistical description of quantitative variable

[Exercise]

One doctor measured the delitescence (incubation period, in days) of an infectious disease in 10 patients. The outcomes are as follows:

6, 13, 5, 9, 12, 10, 8, 11, 8, >14

Calculate the average delitescence.

Page 57: Chapter-2 Statistical description of quantitative variable

[Answer]

There is no exact value at the right end of the distribution, so we should choose the median.

First, we sort the data from the smallest to the largest:

5 6 8 8 9 10 11 12 13 >14

Then we calculate the mean of the two middle values, 9 and 10, which is 9.5.

So the average delitescence is 9.5 days.

Page 58: Chapter-2 Statistical description of quantitative variable

[Usage of the median]

• The median can be used for any type of quantitative variable: not only for data with a normal distribution, but also for data with a skewed distribution or data containing values that are not known exactly.

• In symmetrical data, the mean theoretically equals the median.

Page 59: Chapter-2 Statistical description of quantitative variable

Mode

[Definition] The mode of a set of measurements is defined as the measurement that occurs most often (with the highest frequency).

Page 60: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

[Example 2.8]

Please find out the mode of 9

undergraduates’ English scores

76 87 69 76 85 80 79 81 83

The value 76 occurs twice and every other score occurs only once, so the mode is 76.

Page 61: Chapter-2 Statistical description of quantitative variable

The mode is the value that occurs most often. In some cases, there may be more than one mode.

Page 62: Chapter-2 Statistical description of quantitative variable

[Example 2.9]

Find the mode of the heights (m) of 10 boys.

1.45, 1.50, 1.32, 1.37, 1.45, 1.60, 1.48, 1.41, 1.35, 1.50

There are two modes in this example: 1.45 and 1.50, each of which occurs twice.
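Both mode examples can be reproduced with `statistics.multimode` from the Python standard library (available from Python 3.8), which returns every value that shares the highest frequency. The variable names below are illustrative.

```python
from statistics import multimode

english_scores = [76, 87, 69, 76, 85, 80, 79, 81, 83]                     # Example 2.8
heights_m = [1.45, 1.50, 1.32, 1.37, 1.45, 1.60, 1.48, 1.41, 1.35, 1.50]  # Example 2.9

# multimode returns a list, so one mode and several modes are handled alike.
modes_scores = multimode(english_scores)
modes_heights = multimode(heights_m)

print(modes_scores)
print(modes_heights)
```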

Page 63: Chapter-2 Statistical description of quantitative variable

Summary

In a normal distribution, the mean, median, and mode are identical.

For normal distributions, the mean is the most efficient measure and reflects the characteristics of all the measurements.

Page 64: Chapter-2 Statistical description of quantitative variable


Page 65: Chapter-2 Statistical description of quantitative variable

Section 3 Measures of dispersion tendency

Page 66: Chapter-2 Statistical description of quantitative variable

Central tendency reflects the average level of a quantitative variable. But knowing only the central tendency of a distribution is not enough; we should also describe the variation of the observations.

Page 67: Chapter-2 Statistical description of quantitative variable

Group A: 3 4 5 6 7

Group B: 1 3 5 7 9

Mean of group A = (3+4+5+6+7)/5 = 5

Mean of group B = (1+3+5+7+9)/5 = 5

The two groups have the same mean, but their dispersions are different.
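A quick numerical check makes the point concrete. The sketch below compares the two groups using the mean and the range (MAX minus MIN); the dictionary layout and names are assumptions for illustration.

```python
groups = {
    "A": [3, 4, 5, 6, 7],
    "B": [1, 3, 5, 7, 9],
}

# For each group, store (mean, range).
summary = {}
for name, values in groups.items():
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    summary[name] = (mean, spread)
    print(f"Group {name}: mean = {mean}, range = {spread}")
```

Both groups have the same mean, but the range of group B is twice that of group A, so the mean alone does not distinguish them.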

Page 68: Chapter-2 Statistical description of quantitative variable

Measures of dispersion tendency:

1 Range
2 Interquartile range
3 Variance or standard deviation
4 Coefficient of variation

Dispersion tendency reflects the degree of variability of different measurements.

Page 69: Chapter-2 Statistical description of quantitative variable

Range

[Definition]

Range = Value(max) - Value(min)

The range is the difference between MAX and MIN.

Page 70: Chapter-2 Statistical description of quantitative variable

[Example 3.1]

Determine the range of the following data set.

1, 6, 2, 3, 9, 7, 5

[Solution 3.1]

Range = 9 - 1 = 8.

Page 71: Chapter-2 Statistical description of quantitative variable

Merit of the range

It is the simplest measure of data variability.

Limitation of the range

It is the least informative measure, because it reflects only the difference between MAX and MIN and is easily affected by extreme values.

Page 72: Chapter-2 Statistical description of quantitative variable

Interquartile range

[Definition]

The interquartile range is the distance between the third quartile Q3 (P75) and the first quartile Q1 (P25).

This distance includes the middle 50 percent of the observations.

Interquartile range = Q3 - Q1

The quartiles Q1, Q2 and Q3 divide the ordered data, from the lower end L to the upper end U, into four parts, each containing 25% of the observations.

Page 73: Chapter-2 Statistical description of quantitative variable

[Example 3.2]

Calculate the IQR in Example 1.2 using the following table.

Page 74: Chapter-2 Statistical description of quantitative variable

Cholesterol (mmol/L)   Frequency   Percentage   Cumulative frequency   Cumulative percentage
4.0-4.5                1           2%           1                      2%
4.5-5.0                2           4%           3                      6%
5.0-5.5                4           8%           7                      14%
5.5-6.0                11          22%          18                     36%
6.0-6.5                11          22%          29                     58%
6.5-7.0                11          22%          40                     80%
7.0-7.5                7           14%          47                     94%
7.5-8.0                3           6%           50                     100%
Total                  50          100%

In each interval, the first value is the lower limit and the second is the upper limit.

Page 75: Chapter-2 Statistical description of quantitative variable

[Solution 3.2]

First, we calculate P25 and P75:

$P_{25} = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right) = 5.5 + \frac{0.5}{11}(50 \times 25\% - 7) = 5.75$

$P_{75} = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right) = 6.5 + \frac{0.5}{11}(50 \times 75\% - 29) = 6.89$

IQR = 6.89 - 5.75 = 1.14

Page 76: Chapter-2 Statistical description of quantitative variable

[Properties of the interquartile range]

The IQR (Q), although more sensitive than the range to data pileup about the midpoint, is still not sufficient for our purpose. It reflects only the variability of the middle 50% of the measurements, and it is of limited use in interpreting the variability of a single set of measurements.

Page 77: Chapter-2 Statistical description of quantitative variable

Variance

[Definition]

The population variance of a set of N measurements X1, X2, …, XN with arithmetic mean μ is the sum of the squared deviations divided by N.

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

Page 78: Chapter-2 Statistical description of quantitative variable

[Definition]

The sample variance of a set of n measurements X1, X2, …, Xn with arithmetic mean $\bar{X}$ is the sum of the squared deviations divided by n - 1.

$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

Page 79: Chapter-2 Statistical description of quantitative variable

$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

$\bar{X}$ is the sample mean.

$(X - \bar{X})^2$ is the squared deviation.

n - 1 is the degrees of freedom.

Page 80: Chapter-2 Statistical description of quantitative variable

[Example 3.3]

The time between an electric light stimulus and a bar press to avoid a shock was noted for each of five conditioned rats. Use the data below to compute the sample variance.

Shock avoidance times (seconds): 5, 4, 3, 1, 3

Page 81: Chapter-2 Statistical description of quantitative variable

[Solution 3.3]

The deviations and the squared deviations are shown below. The sample mean is 3.2.

$X_i$     $X_i - \bar{X}$     $(X_i - \bar{X})^2$
5         1.8                 3.24
4         0.8                 0.64
3         -0.2                0.04
1         -2.2                4.84
3         -0.2                0.04
Total 16  0                   8.80

Page 82: Chapter-2 Statistical description of quantitative variable

[Solution 3.3, continued]

Using the total of the squared deviations column, we find the sample variance to be

$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{8.8}{4} = 2.2$
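The steps of Solution 3.3 can be followed in code. This sketch computes the deviations, squared deviations, and sample variance by hand, and also takes the square root (the standard deviation); all variable names are illustrative.

```python
import math

# Shock avoidance times (seconds) from Example 3.3.
times = [5, 4, 3, 1, 3]
n = len(times)

mean = sum(times) / n                               # sample mean
deviations = [x - mean for x in times]              # X - mean
squared_deviations = [d ** 2 for d in deviations]   # (X - mean)^2

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(squared_deviations) / (n - 1)
# Standard deviation: positive square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum(squared_deviations):.2f}")
print(f"sample variance = {sample_variance:.2f}")
print(f"sample standard deviation = {sample_sd:.2f}")
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance.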

Page 83: Chapter-2 Statistical description of quantitative variable

[Properties of the variance]

All values are used in the calculation.

It is influenced by extreme values, because every deviation from the mean is squared.

The units of the variance are difficult to interpret: they are the square of the original units.

Page 84: Chapter-2 Statistical description of quantitative variable

Standard deviation

[Definition]

The standard deviation is the positive square root of the variance.

[Symbol]

Population standard deviation: σ

Sample standard deviation: S

$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$

Page 85: Chapter-2 Statistical description of quantitative variable

[Example 3.4]

Calculate the sample standard deviation in Example 3.3.

[Solution 3.4]

$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{8.8}{4}} = \sqrt{2.2} = 1.48$

Page 86: Chapter-2 Statistical description of quantitative variable

[Properties of the standard deviation]

- It is the most commonly used measure of the variability of a quantitative variable, because it uses every observation and is expressed in the original units.

- It is an appropriate description of variability mainly when the data come from a normal (or approximately symmetric) distribution.

Page 87: Chapter-2 Statistical description of quantitative variable

Coefficient of variation

[Definition]

The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

$CV = \frac{s}{\bar{X}} \times 100\%$

Page 88: Chapter-2 Statistical description of quantitative variable

[Usage]

To compare the variability of measurements with different units, such as height (cm) and weight (kg).

To compare variability when the means of two groups are quite different, one very small and the other very large, such as the weights of elephants and infants.

Page 89: Chapter-2 Statistical description of quantitative variable

[Example 3.6]

One doctor measured the heights and weights of 50 people. The results are

Height: $\bar{X} = 165$ cm, $S = 8.5$ cm

Weight: $\bar{X} = 64$ kg, $S = 7$ kg

Which has the larger variability, height or weight?

Page 90: Chapter-2 Statistical description of quantitative variable

[Solution 3.6]

Height: $CV = 8.5 / 165 \times 100\% = 5.15\%$

Weight: $CV = 7 / 64 \times 100\% = 10.9\%$

So the variability of weight is much larger.
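The comparison in Example 3.6 can be reproduced with a one-line function. The function name `coefficient_of_variation` is an assumption for this sketch; the inputs are the means and standard deviations given in the example.

```python
def coefficient_of_variation(sd, mean):
    """CV = standard deviation / mean, expressed as a percentage."""
    return sd / mean * 100


cv_height = coefficient_of_variation(8.5, 165)   # height in cm
cv_weight = coefficient_of_variation(7, 64)      # weight in kg

print(f"CV of height = {cv_height:.2f}%")
print(f"CV of weight = {cv_weight:.1f}%")
```

Because the units cancel in the ratio, the two percentages can be compared directly even though height and weight are measured in different units.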

Page 91: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

Page 92: Chapter-2 Statistical description of quantitative variable

Department of Health Statistics

Page 93: Chapter-2 Statistical description of quantitative variable

Description of data from a normal distribution: $\bar{X} \pm S$

Description of data from a skewed distribution: $M\ (P_{25}, P_{75})$

Page 94: Chapter-2 Statistical description of quantitative variable
