Upload: cora-west

Post on 20-Jan-2016

Page 1: 1 Probabilistic and Statistical Techniques Lecture 5 Dr. Nader Okasha

Probabilistic and Statistical Techniques

Lecture 5

Dr. Nader Okasha

Descriptive measures of data:

1. Measure of Center

2. Measure of Variation

3. Measure of Position

Range

The range of a set of data is the difference between the maximum value and the minimum value.

Range = (maximum value) - (minimum value)

The range is very easy to compute, but because it depends only on the highest and lowest values, it isn't as useful as the other measures of variation, which use every value.
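As a quick illustration, the range can be computed in a few lines of Python. The two samples below are hypothetical values chosen only for this sketch; they share the same extremes, so they have the same range even though their interior values differ.

```python
def data_range(values):
    """Return the range: (maximum value) - (minimum value)."""
    return max(values) - min(values)

# Hypothetical samples, not taken from the lecture.
a = [2, 5, 5, 6, 9]
b = [2, 3, 8, 8, 9]  # different interior values, same minimum and maximum

range_a = data_range(a)
range_b = data_range(b)
print(range_a, range_b)
```

Because only the two extreme values enter the calculation, the range cannot tell these two samples apart.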

Sample Standard Deviation

The standard deviation of a set of sample values is a measure of variation of values about the mean.

Sample Standard Deviation Formula

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Sample Standard Deviation (Shortcut Formula)

$$s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$$

Example

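For the data set below, determine the standard deviation.

| x | x^2 |
|---|---|
| 41 | 1681 |
| 44 | 1936 |
| 45 | 2025 |
| 47 | 2209 |
| 47 | 2209 |
| 48 | 2304 |
| 51 | 2601 |
| 53 | 2809 |
| 58 | 3364 |
| 66 | 4356 |
| Sum = 500 | Sum = 25494 |

Solution

$$s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\frac{10(25494) - (500)^2}{10(9)}} = 7.4$$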
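The example can be checked numerically. The sketch below uses the ten data values from the table and evaluates both the defining formula and the shortcut formula; the variable names are only illustrative.

```python
import math

# Data values from the example table.
x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]

n = len(x)
sum_x = sum(x)                   # sum of x
sum_x2 = sum(v * v for v in x)   # sum of x^2
mean = sum_x / n

# Defining formula: sqrt( sum (x - mean)^2 / (n - 1) )
s_definition = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))

# Shortcut formula: sqrt( (n * sum x^2 - (sum x)^2) / (n (n - 1)) )
s_shortcut = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(sum_x, sum_x2)
print(round(s_definition, 1), round(s_shortcut, 1))
```

Both formulas are algebraically equivalent, so the two results should agree up to floating-point rounding.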

Standard Deviation - Important Properties

The standard deviation is a measure of variation of all values from the mean.

The value of the standard deviation s is usually positive. It is zero only when all of the data values are the same number, and it is never negative.

The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).

The units of the standard deviation s are the same as the units of the original data values.
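To see the effect of an outlier, the following sketch compares the standard deviation of the ten values 41, 44, ..., 66 used in this lecture's example with the same data after one extra value is appended. The extra value 150 is hypothetical and was chosen only to act as an outlier.

```python
import statistics

x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]
x_with_outlier = x + [150]  # 150 is a hypothetical outlier, not from the lecture

s_without = statistics.stdev(x)             # sample standard deviation
s_with = statistics.stdev(x_with_outlier)

# When every value is identical there is no variation, so s is zero.
s_constant = statistics.stdev([5, 5, 5, 5])

print(s_without, s_with, s_constant)
```

A single value far from the rest is enough to make s much larger, which is why outliers should be examined before the standard deviation is interpreted.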

Population Standard Deviation

This formula is similar to the sample standard deviation formula, but the population mean μ and the population size N are used instead.

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
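The difference between dividing by N and dividing by n - 1 can be seen in a short sketch. Purely for illustration, it treats the ten values 41, 44, ..., 66 from the earlier example as if they were a complete population; that is an assumption of this sketch, not a claim made in the lecture.

```python
import math
import statistics

def population_std(values):
    """Population standard deviation: divide by N and use the population mean."""
    N = len(values)
    mu = sum(values) / N
    return math.sqrt(sum((v - mu) ** 2 for v in values) / N)

# Assumption: these ten values are treated here as a whole population.
x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]

sigma = population_std(x)   # divides by N
s = statistics.stdev(x)     # sample version, divides by n - 1

print(sigma, s)
```

For the same data, the population formula always gives a slightly smaller value than the sample formula, because it divides by N rather than N - 1.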

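Standard deviation from a Histogram

Use the interval midpoint for the variable x:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 \cdot f}{n - 1}}$$

where x is the interval midpoint, f is the number of counts in the interval, and $n = \sum f$.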

Example

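$$\bar{x} = \frac{\sum f \cdot x}{\sum f} = \frac{2718}{76} = 35.76$$

| Class limits | Class midpoint x | No. of counts f | f · x | (x - x̄)² · f |
|---|---|---|---|---|
| 21 - 30 | 25.5 | 28 | 714 | 2947.5 |
| 31 - 40 | 35.5 | 30 | 1065 | 2.0 |
| 41 - 50 | 45.5 | 12 | 546 | 1138.4 |
| 51 - 60 | 55.5 | 2 | 111 | 779.3 |
| 61 - 70 | 65.5 | 2 | 131 | 1768.9 |
| 71 - 80 | 75.5 | 2 | 151 | 3158.5 |
| Sum | | 76 | 2718 | 9795 |

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 \cdot f}{n - 1}} = \sqrt{\frac{9795}{76 - 1}} = 11.43$$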
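The grouped-data calculation can be reproduced with a short sketch that uses the class midpoints and counts from the example table. The variable names are illustrative, and the mean is kept at full precision rather than rounded to 35.76, so intermediate values may differ slightly in the last digit from the table.

```python
import math

# Class midpoints x and counts f from the example table.
midpoints = [25.5, 35.5, 45.5, 55.5, 65.5, 75.5]
counts = [28, 30, 12, 2, 2, 2]

n = sum(counts)                                            # n = sum of f
sum_fx = sum(f * x for x, f in zip(midpoints, counts))     # sum of f * x
mean = sum_fx / n

# Sum of (x - mean)^2 * f over all classes.
weighted_ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, counts))
s = math.sqrt(weighted_ss / (n - 1))

print(n, sum_fx, round(mean, 2), round(s, 2))
```

Each midpoint stands in for every observation in its class, so the result is an approximation of the standard deviation of the raw data.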

Variance

The variance of a set of values is a measure of variation equal to the square of the standard deviation.

Sample variance $s^2$: Square of the sample standard deviation $s$

Population variance $\sigma^2$: Square of the population standard deviation $\sigma$
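A brief sketch of the relationship between variance and standard deviation, using the ten values 41, 44, ..., 66 from the earlier standard deviation example and Python's standard `statistics` module:

```python
import math
import statistics

x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]

s2 = statistics.variance(x)        # sample variance (divides by n - 1)
sigma2 = statistics.pvariance(x)   # population variance (divides by N)

# The standard deviation is the square root of the variance.
s = math.sqrt(s2)

print(s2, sigma2, s)
```

Note that the variance is expressed in squared units of the original data, while the standard deviation is in the original units.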

Range Rule of Thumb

If the standard deviation is known, we can use it to find rough estimates of the minimum and maximum ‘usual’ sample values as follows:

minimum usual value = (mean) - 2 * (standard deviation)

maximum usual value = (mean) + 2 * (standard deviation)

Example

Results from the National Health Survey show that the heights of men have a mean of 69 in and a standard deviation of 2.8 in. Use the range rule of thumb to find the minimum and maximum usual heights.

minimum usual value = (mean) - 2 * (standard deviation)

= 69 -2*2.8 = 63.4 in

maximum usual value = (mean) + 2 * (standard deviation)

= 69+2*2.8 = 74.6 in
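The same calculation can be written as a small helper. The function name `usual_range` is hypothetical and used only for this sketch.

```python
def usual_range(mean, std_dev):
    """Range rule of thumb: return (minimum usual value, maximum usual value)."""
    return mean - 2 * std_dev, mean + 2 * std_dev

# Heights of men from the example: mean 69 in, standard deviation 2.8 in.
min_usual, max_usual = usual_range(69, 2.8)
print(round(min_usual, 1), round(max_usual, 1))
```

Values outside this interval would be considered unusual under the rule of thumb.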

Empirical Rule

For data sets having a distribution that is approximately bell shaped, the following properties apply:

About 68% of all values fall within 1 standard deviation of the mean.

About 95% of all values fall within 2 standard deviations of the mean.

About 99.7% of all values fall within 3 standard deviations of the mean.
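The three percentages can be checked against an exactly normal distribution, which is one common model of a bell-shaped distribution (an assumption of this sketch; real data only follow the rule approximately). For a normal distribution, the proportion of values within k standard deviations of the mean equals erf(k / √2).

```python
import math

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

p1 = normal_fraction_within(1)
p2 = normal_fraction_within(2)
p3 = normal_fraction_within(3)

for k, p in [(1, p1), (2, p2), (3, p3)]:
    print(f"within {k} s.d.: {100 * p:.1f}%")
```

The computed proportions agree with the rounded figures of about 68%, 95%, and 99.7% quoted in the rule.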

Chebyshev Theorem

The proportion (fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$, where K is any positive number greater than 1. For K = 2 and K = 3, we get the following results.

- At least 3/4 of the values lie within 2 s.d. of the mean

- At least 8/9 of the values lie within 3 s.d. of the mean
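The bound can be evaluated exactly with fractions, and compared with an actual data set. The sketch below uses a hypothetical helper `chebyshev_bound` and, as an illustration, the ten values 41, 44, ..., 66 from the earlier standard deviation example.

```python
import statistics
from fractions import Fraction

def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / Fraction(k) ** 2

bound_2 = chebyshev_bound(2)
bound_3 = chebyshev_bound(3)
print(bound_2, bound_3)

# Compare the K = 2 bound with the observed fraction in a small sample.
x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]
mean = statistics.mean(x)
s = statistics.stdev(x)
inside = sum(1 for v in x if abs(v - mean) <= 2 * s)
observed_2 = Fraction(inside, len(x))
print(observed_2)
```

The theorem gives only a lower bound that holds for any distribution, so the observed fraction is usually larger than the bound.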

Coefficient of variation

The coefficient of variation (or CV) for a set of sample or population data, expressed as a percent, describes the standard deviation relative to the mean.

Sample:

$$CV = \frac{s}{\bar{x}} \cdot 100\%$$

Population:

$$CV = \frac{\sigma}{\mu} \cdot 100\%$$
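As an illustration, the sample coefficient of variation can be computed for the ten values 41, 44, ..., 66 from the earlier standard deviation example. The function name `coefficient_of_variation` is hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample CV as a percent: (s / mean) * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

x = [41, 44, 45, 47, 47, 48, 51, 53, 58, 66]
cv = coefficient_of_variation(x)
print(f"CV = {cv:.1f}%")
```

Because the CV is a ratio with no units, it can be used to compare the variation of data sets measured in different units or with very different means.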