last lecture summary which measures of central tendency do you know? which measures of variability...


Upload: eustace-richardson

Post on 13-Dec-2015


Last lecture summary
• Which measures of central tendency do you know?
• Which measures of variability do you know?
• Empirical rule: 68%, 95%, 99.7%
• Population, census, sample, statistic, parameter
• Statistical inference

Statistical jargon

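Population - parameter: mean $\mu$, standard deviation $\sigma$

Sample - statistic: mean $\bar{x}$, standard deviation $s$

Czech terms: výběr (sample), statistika (statistic), výběrový průměr (sample mean), výběrová směrodatná odchylka (sample standard deviation)

population (census) vs. sample
parameter (population) vs. statistic (sample)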

Sampling
• Representative sample, random sample
• Sampling with/without replacement
• Bias

New stuff

Bessel’s correction

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

www.udacity.com – Statistics

Sample vs. population SD
• We use the sample standard deviation to approximate the population parameter.
• But don’t confuse it with the actual standard deviation of a small dataset.
• For example, take this dataset: 5 2 1 0 7. Do you divide by $n$ or by $n-1$?
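The two divisors can be compared directly on the dataset from the slide. This is a minimal sketch using Python's standard `statistics` module; the variable names are only illustrative. Dividing by n gives about 2.61, dividing by n - 1 gives about 2.92.

```python
import statistics

data = [5, 2, 1, 0, 7]  # dataset from the slide

# Treat the five values as the whole population: divide by n.
population_sd = statistics.pstdev(data)

# Treat them as a sample used to estimate a population SD:
# divide by n - 1 (Bessel's correction).
sample_sd = statistics.stdev(data)

print(f"divide by n:     {population_sd:.4f}")
print(f"divide by n - 1: {sample_sd:.4f}")
```

The corrected value is always the larger of the two, and the gap shrinks as n grows.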

Median absolute deviation (MAD)
• The standard deviation is not robust.
• The IQR is robust.
• The median absolute deviation (MAD) is a robust equivalent of the standard deviation.
• Take your data, find the median, calculate each value's absolute deviation from the median, then find the median of those absolute deviations.
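The recipe above can be sketched in a few lines. The function name `median_absolute_deviation` is an illustrative choice, and the data reuse the small dataset 5 2 1 0 7 from the earlier slide; the second call replaces one value with an outlier (a made-up value) to show the robustness.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    absolute_deviations = [abs(x - med) for x in values]
    return statistics.median(absolute_deviations)

data = [5, 2, 1, 0, 7]
mad = median_absolute_deviation(data)

# Replace the largest value with an extreme outlier (hypothetical value).
with_outlier = [5, 2, 1, 0, 700]
mad_outlier = median_absolute_deviation(with_outlier)

print(mad, mad_outlier)
print(statistics.stdev(data), statistics.stdev(with_outlier))
```

The MAD stays the same after the outlier is introduced, while the standard deviation changes by orders of magnitude.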

Median absolute deviation (MAD)
Table columns: Data | Deviation from the median | Absolute deviation
Cell values as extracted (row and column assignment lost): 5, 10, 30, 20, 30, 5, 15, 10, 15
Median:
MAD:

NORMAL DISTRIBUTION

Playing chess
• Pretend I am a chess player.
• Which of the following tells you most about how good I am?
1. My rating is 1800.
2. 8110th place among world competitive chess players.
3. Ranked higher than 88% of competitive chess players.

Distribution

Distribution of scores in one particular year

We should use relative frequencies and convert all absolute frequencies to proportions.
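Converting absolute frequencies to proportions is a single division by the total count. A minimal sketch follows; the bin labels and counts are hypothetical, not taken from the lecture data.

```python
def relative_frequencies(counts):
    """Convert a mapping of absolute frequencies to proportions."""
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Hypothetical absolute frequencies per bin (counts of observations).
absolute = {"160-165": 4, "165-170": 10, "170-175": 16, "175-180": 8, "180-185": 2}
relative = relative_frequencies(absolute)

for label, share in relative.items():
    print(f"{label}: {share:.1%}")
```

The proportions always sum to 1, which makes distributions with different sample sizes comparable.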

Height data – absolute frequencies

http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_020108_HeightsWeights

Height data – relative frequencies

What proportion of values is between 170 cm and 173.75 cm?

30%

Height data – relative frequencies

What proportion of values is between 170 cm and 175 cm?

We can’t tell for certain.

• How should we modify the data or histogram to get more detail?
1. Add more values to the dataset
2. Increase the bin size
3. Use a smaller bin size
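A histogram can only answer range questions whose limits fall on bin edges. The sketch below illustrates this with hypothetical heights and two assumed bin widths (3.75 cm and 1.25 cm); the helper names are illustrative. With the coarse bins, 175 cm is not an edge, so the question cannot be answered; with the finer bins it can.

```python
def histogram_proportions(values, start, width, n_bins):
    """Return [(left_edge, right_edge, proportion), ...] for equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    total = len(values)
    return [(start + i * width, start + (i + 1) * width, c / total)
            for i, c in enumerate(counts)]

def proportion_between(bins, low, high):
    """Sum bin proportions if low and high are both bin edges, else None."""
    edges = {b[0] for b in bins} | {b[1] for b in bins}
    if low not in edges or high not in edges:
        return None  # the histogram cannot answer this exactly
    return sum(p for left, right, p in bins if left >= low and right <= high)

# Hypothetical heights in cm.
heights = [166.0, 168.5, 170.2, 171.0, 172.6, 173.0, 174.1, 174.9, 176.3, 179.0]

coarse = histogram_proportions(heights, start=162.5, width=3.75, n_bins=6)
fine = histogram_proportions(heights, start=162.5, width=1.25, n_bins=18)

print(proportion_between(coarse, 170, 173.75))
print(proportion_between(coarse, 170, 175))  # None: 175 is not a coarse edge
print(proportion_between(fine, 170, 175))
```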

Height data – relative frequencies

What proportion of values is between 170 cm and 175 cm?

36%


Normal distribution

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$

recall the empirical rule

68-95-99.7
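The 68-95-99.7 figures can be checked against the normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(k):.1%}")
```

The result does not depend on the particular mean and standard deviation, which is why the rule applies to any normal distribution.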

$N(\mu, \sigma^2)$

STANDARD NORMAL DISTRIBUTION

Who is more popular?


s.d. = 36

s.d. = 60

Z = -3.53

Z = -2.57

Standardizing

Formula: $Z = \frac{x - \mu}{\sigma}$
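Standardizing a value means subtracting the mean and dividing by the standard deviation. The sketch below assumes this standard definition of the Z-score; the mean of 173 cm and standard deviation of 5 cm are assumptions chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Assumed distribution of heights: mean 173 cm, standard deviation 5 cm.
z_tall = z_score(183, mean=173, sd=5)
z_short = z_score(168, mean=173, sd=5)

print(z_tall, z_short)
```

Because Z-scores are unit-free, values from distributions with different means and spreads can be compared on the same scale.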

Quiz
• What does a negative Z-score mean?
1. The original value is negative.
2. The original value is less than the mean.
3. The original value is less than 0.
4. The original value minus the mean is negative.

Quiz II
• If we standardize a distribution by converting every value to a Z-score, what will be the new mean of this standardized distribution?
• If we standardize a distribution by converting every value to a Z-score, what will be the new standard deviation of this standardized distribution?
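Both questions can be explored numerically. This sketch standardizes the small dataset 5 2 1 0 7 from the earlier slide, treating it as a whole population (an assumption, hence `pstdev`), and then recomputes the mean and standard deviation of the Z-scores.

```python
import statistics

data = [5, 2, 1, 0, 7]

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population SD, dividing by n

# Convert every value to a Z-score.
z_scores = [(x - mean) / sd for x in data]

new_mean = statistics.mean(z_scores)
new_sd = statistics.pstdev(z_scores)

print(f"new mean: {new_mean:.6f}, new sd: {new_sd:.6f}")
```

Up to floating-point rounding, the standardized values have mean 0 and standard deviation 1, whatever the original data were.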

Standard normal distribution

$N(0, 1)$

Z – number of standard deviations away from the mean

If the Z-value is +1, what percentage of values is less than that value?

approx. 84%

Z axis: -3 -2 -1 0 +1 +2 +3
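The figure of about 84% follows from the empirical rule: 50% of values lie below the mean, plus half of the 68% that lie within one standard deviation. A quick check with `statistics.NormalDist` (standard library, Python 3.8+):

```python
from statistics import NormalDist

# Proportion of a standard normal distribution below Z = +1.
below_plus_one = NormalDist().cdf(1.0)

# Empirical-rule approximation: half the distribution plus half of 68%.
rule_of_thumb = 0.50 + 0.68 / 2

print(f"exact: {below_plus_one:.1%}, rule of thumb: {rule_of_thumb:.0%}")
```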

Proportion of human heights

Z axis: -2 -1 0 +1 +2

Quiz
• Approximately what proportion of people is shorter than 168 cm?

Height axis (cm): 163 168 173 178 183

16%

Quiz
• Approximately what proportion of people is taller than 183 cm?

Height axis (cm): 163 168 173 178 183

2.5%

Quiz
• Approximately what proportion of people is between 163 cm and 178 cm tall?

Height axis (cm): 163 168 173 178 183

81.5%

Quiz
• Approximately what proportion of people is shorter than 180 cm?

Height axis (cm): 163 168 173 178 183

ca 91.5%
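The four height quizzes can be cross-checked against a normal model. The sketch assumes a mean of 173 cm and a standard deviation of 5 cm, read off the axis labels (centre at 173, ticks 5 cm apart); the slides do not state these parameters explicitly.

```python
from statistics import NormalDist

# Assumption: heights ~ normal with mean 173 cm and sd 5 cm (from the axis ticks).
heights = NormalDist(mu=173, sigma=5)

below_168 = heights.cdf(168)                           # 1 sd below the mean
above_183 = 1 - heights.cdf(183)                       # 2 sd above the mean
between_163_178 = heights.cdf(178) - heights.cdf(163)  # from -2 sd to +1 sd
below_180 = heights.cdf(180)                           # 1.4 sd above the mean

for label, p in [("< 168 cm", below_168), ("> 183 cm", above_183),
                 ("163-178 cm", between_163_178), ("< 180 cm", below_180)]:
    print(f"{label}: {p:.1%}")
```

The model values differ slightly from the quiz answers, which use the rounded 68-95-99.7 figures.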

Quiz
• What is the probability of randomly selecting a height in the sample that is >5 standard deviations above the mean?
1. 0.01
2. 0.3
3. 0.8
4. 0.99

Quiz
• What is the probability of randomly selecting a height in the sample that is <5 standard deviations below the mean?
1. 0.01
2. 0.3
3. 0.8
4. 0.99
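For a normal model, the probability of landing beyond 5 standard deviations in either direction is tiny, far smaller than any of the listed options. The sketch reads the second quiz as asking about the lower tail beyond 5 standard deviations, which is an assumption about the intended question.

```python
from statistics import NormalDist

standard = NormalDist()

# Probability of a value more than 5 sd above the mean.
upper_tail = 1 - standard.cdf(5)

# Probability of a value more than 5 sd below the mean (same by symmetry).
lower_tail = standard.cdf(-5)

print(f"upper tail: {upper_tail:.2e}, lower tail: {lower_tail:.2e}")
```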

Quiz
• For a normal distribution, what proportion of the data lies either more than 2 standard deviations below the mean or more than 2 standard deviations above it?

Figure labels: 95% within 2 standard deviations of the mean, 2.5% in each tail
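Adding the two tails gives the answer: 2.5% + 2.5% = 5% under the empirical rule. A short check with `statistics.NormalDist` shows the rounded and the model values side by side.

```python
from statistics import NormalDist

standard = NormalDist()

# Both tails: below -2 sd plus above +2 sd.
outside_two_sd = standard.cdf(-2) + (1 - standard.cdf(2))

# Empirical-rule approximation from the figure labels.
rule_of_thumb = 0.025 + 0.025

print(f"model: {outside_two_sd:.2%}, rule of thumb: {rule_of_thumb:.0%}")
```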

Z-table
What proportion of values lies below the point with the Z-score -2.75?
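A printed Z-table lists cumulative proportions by row (first decimal of Z) and column (second decimal). The sketch below rebuilds one such row with `statistics.NormalDist`; the helper name `z_table_row` and the row layout are illustrative assumptions about how the table is organised.

```python
from statistics import NormalDist

standard = NormalDist()

def z_table_row(row_z):
    """Cumulative proportions for a negative row: row_z - 0.00 ... row_z - 0.09."""
    return [standard.cdf(row_z - col / 100) for col in range(10)]

row = z_table_row(-2.7)

# Row -2.7, column .05 corresponds to Z = -2.75.
proportion_below = row[5]

print(f"P(Z < -2.75) is about {proportion_below:.4f}")
```

The looked-up value is roughly 0.003, so only about 0.3% of values lie below Z = -2.75.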