Last lecture summary The nature of the normal distribution Non-Gaussian distributions

Upload: damian-rogers

Post on 17-Jan-2018



TRANSCRIPT

Page 1: Last lecture summary The nature of the normal distribution Non-Gaussian distributions

Last lecture summary
• The nature of the normal distribution
• Non-Gaussian distributions

Page 2:

New stuff

Page 3:

Lognormal distribution
• Frazier et al. measured the ability of the drug isoprenaline to relax the bladder muscle.
• The results are expressed as the EC50, which is the concentration required to relax the bladder halfway between its minimum and maximum possible relaxation.

Page 4:

Lognormal distribution

Page 5:

Geometric mean

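Geometric mean: transform all values to their logarithms, calculate the mean of the logarithms, and transform this mean back to the units of the original data (antilog).

Arithmetic mean: $\bar{x} = 1333\ \text{nM}$
Mean of the logarithms: $\overline{\log_{10} x} = 2.71$
Geometric mean: $10^{2.71} = 513\ \text{nM}$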
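The three steps of the geometric mean can be sketched in R. The EC50 values below are hypothetical and chosen only for illustration; they are not the Frazier et al. measurements.

```r
# Hypothetical EC50 values in nM (illustrative only, not the Frazier et al. data)
ec50 <- c(100, 250, 500, 1200, 4000)

log_values <- log10(ec50)        # 1. transform all values to their logarithms
mean_log   <- mean(log_values)   # 2. calculate the mean of the logarithms
geo_mean   <- 10^mean_log        # 3. antilog: back to the original units (nM)

arith_mean <- mean(ec50)         # ordinary arithmetic mean, for comparison
print(c(geometric = geo_mean, arithmetic = arith_mean))
```

For right-skewed data such as these, the geometric mean comes out below the arithmetic mean because a few large values pull the arithmetic mean upwards.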

Page 6:

The nature of the lognormal distribution

• Lognormal distributions arise when multiple random factors are multiplied together to determine the value.
• A typical example: cancer (cell division is multiplicative).

• Lognormal distributions are very common in many scientific fields.
• Drug potency is lognormal.

• To analyse lognormal data, do not use methods that assume the Gaussian distribution. You will get misleading results (e.g., non-existent outliers).

• A better way is to convert the data to logarithms and analyse the converted values.
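A small simulation can illustrate both points: values built by multiplying random factors are skewed, and taking logarithms makes them roughly symmetric. This is only a sketch; the number of factors, their range, and the seed are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Each value is the product of 20 random positive factors (hypothetical choice)
n_values  <- 5000
n_factors <- 20
factors <- matrix(runif(n_values * n_factors, min = 0.5, max = 1.5),
                  nrow = n_values)
x <- apply(factors, 1, prod)

# Raw values are right-skewed: the mean sits well above the median
raw_gap <- mean(x) - median(x)

# On the log scale the product becomes a sum, so mean and median nearly agree
log_x   <- log10(x)
log_gap <- mean(log_x) - median(log_x)

print(c(raw_gap = raw_gap, log_gap = log_gap))
```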

Page 7:

How normal is normal?

http://www.nate-miller.org/blog/how-normal-is-normal-a-q-q-plot-approach

Checking normality
1. Eyeball histograms
2. Eyeball QQ plots
3. There are tests
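The three checks could look like this in base R. The sample is simulated and hypothetical, and the Shapiro-Wilk test is just one possible choice of test; the slide does not name a specific one.

```r
set.seed(42)                           # arbitrary seed
x <- rnorm(200, mean = 50, sd = 10)    # hypothetical sample

hist(x, main = "Histogram")            # 1. eyeball the histogram
qqnorm(x); qqline(x)                   # 2. eyeball the QQ plot
result <- shapiro.test(x)              # 3. a formal test (Shapiro-Wilk)
print(result$p.value)
```

A small p-value would suggest the data are not normal; a large one only means the test found no evidence against normality.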

Page 8:

QQ plot
• Q stands for ‘quantile’. Quantiles are cut points that divide the sorted data into groups of equal size. The 2-quantile is called the median, the 3-quantiles are called terciles, and the 4-quantiles are called quartiles (similarly, the 10-quantiles are deciles and the 100-quantiles are percentiles).
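As a quick illustration, base R's `quantile()` returns these cut points for any set of probabilities. The data vector is hypothetical.

```r
x <- c(2, 4, 4, 5, 7, 8, 9, 10, 12, 15)   # hypothetical data

med       <- quantile(x, probs = 0.5)                      # 2-quantile (median)
quartiles <- quantile(x, probs = c(0.25, 0.5, 0.75))       # 4-quantiles
deciles   <- quantile(x, probs = seq(0.1, 0.9, by = 0.1))  # 10-quantiles

print(quartiles)
```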

Page 9:

Typical normal QQ plot

http://emp.byui.edu/BrownD/Stats-intro/dscrptv/graphs/qq-plot_egs.htm

Page 10:

QQ plot of left-skewed distribution

http://emp.byui.edu/BrownD/Stats-intro/dscrptv/graphs/qq-plot_egs.htm

Page 11:

QQ plot of right-skewed distribution

http://emp.byui.edu/BrownD/Stats-intro/dscrptv/graphs/qq-plot_egs.htm
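Plots like the ones on the last two pages can be reproduced with simulated data. This is a sketch only: the exponential distribution is an assumed stand-in for "skewed data", not the data behind the linked figures.

```r
set.seed(7)                      # arbitrary seed
right_skewed <- rexp(300)        # long tail to the right
left_skewed  <- -rexp(300)       # mirror image: long tail to the left

par(mfrow = c(1, 2))             # two plots side by side
qqnorm(right_skewed, main = "Right-skewed"); qqline(right_skewed)
qqnorm(left_skewed,  main = "Left-skewed");  qqline(left_skewed)
```

In both plots the points bend away from the reference line instead of following it, which is the visual sign of skewness.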

Page 12:

SAMPLING DISTRIBUTIONS (Czech: výběrová rozdělení)

Page 13:
Page 14:

Histogram

Page 15:

$\bar{x} = 19.44$

$\bar{x} = 17.22$

$\bar{x} = 16.89$

Page 16:

Sampling distribution of the sample mean (Czech: výběrové rozdělení výběrového průměru)

Page 17:

Sweet demonstration of the sampling distribution of the mean

Page 18:

Values shown on the slide: 2, 3, 3, 1, 2, 6, 5, 5, 4, 3, 5, 5, 4, 2, 6, 3, 4, 3, 1, 5

Page 19:

Values shown on the slide: 2, 3, 3, 1, 2, 6, 5, 5, 4, 3, 5, 5, 4, 2, 6, 3, 4, 3, 1, 5

mean = 3.3

mean = 1.7

Page 20:

Data 2015
Population: 4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3

25 samples (n = 3) and their averages: 3, 5, 3, 4, 2, 3, 3, 3, 5, 5, 3, 4, 3, 4, 5, 4, 4, 4, 6, 3, 4, 3, 4, 3, 4

http://blue-lover.blog.cz/1106/lentilky
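The classroom exercise can be imitated in R by drawing 25 random samples of size 3 from the 2015 population and averaging each. This is only a sketch: the seed is arbitrary, and the simulated averages will differ from the ones collected in class.

```r
set.seed(2015)  # arbitrary seed so the sketch is reproducible

# Population values from page 20
population <- c(4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3)

# Draw 25 samples of n = 3 (without replacement) and average each one
sample_means <- replicate(25, mean(sample(population, size = 3)))

print(round(sample_means, 1))
```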

Page 21:

Histogram of 2015 data

Page 22:

2015, n = 3, number of samples = 25

Page 23:

Going further
• So far, we have generated 25 samples with n = 3.
• To improve our histogram, we need more samples.
• However, we don’t want to spend ages in the classroom.

• Thus, I have prepared a simulation for you. In this simulation, I use data from 2014 and I generate all possible samples, n = 3.
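Generating all possible samples can be sketched with base R's `combn()`. The 2014 data are not included in this transcript, so the sketch assumes the 2015 population from page 20 instead; the number of samples therefore differs from the lecturer's simulation.

```r
# Assumption: 2015 population from page 20 (the 2014 data are not in the transcript)
population <- c(4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3)

# Mean of every possible sample of size 3 (order ignored, no replacement)
all_means <- combn(population, 3, FUN = mean)

length(all_means)   # choose(20, 3) = 1140 samples
hist(all_means, main = "Sampling distribution, n = 3", xlab = "sample mean")
```

Because every value appears in the same number of samples, the average of all these sample means equals the population mean exactly.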

Page 24:

Sampling distribution, n = 3

1 540 samples

Page 25:

Sampling distribution, n = 5

42 504 samples

Page 26:

Sampling distribution, n = 10

20 030 010 samples

Page 27:

Central limit theorem (CLT)
• The distribution of sample means is approximately normal.

• This holds irrespective of the shape of the underlying distribution (provided its variance is finite).

• The distribution of sample means will increasingly approximate a normal distribution as the sample size increases.

Page 28:

Non-Gaussian distribution
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7, 7, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11

Page 29:

Sampling distribution

n = 2

Page 30:

Sampling distribution

n = 4

Page 31:

Sampling distribution

n = 6

Page 32:

Sampling distribution

n = 8
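The four sampling distributions can be approximated by simulation. This is a sketch under assumptions: it draws random samples with replacement (the slides may have used a different scheme), and the seed and the number of repetitions are arbitrary.

```r
set.seed(123)  # arbitrary seed

# The U-shaped, non-Gaussian population from page 28 (41 values)
population <- rep(1:11, times = c(6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6))

# Means of `reps` random samples of size n (assumption: drawn with replacement)
sampling_dist <- function(n, reps = 10000) {
  replicate(reps, mean(sample(population, size = n, replace = TRUE)))
}

par(mfrow = c(2, 2))   # 2 x 2 grid of histograms
for (n in c(2, 4, 6, 8)) {
  hist(sampling_dist(n), main = paste("n =", n), xlab = "sample mean")
}
```

Even though the population has two peaks at the extremes, the histograms of sample means pile up around the centre and look increasingly bell-shaped as n grows.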

Page 33:

Back to CLT
• Once we know that the sampling distribution of the sample mean is normal, we want to characterize this distribution.

• Which numbers characterize a distribution?

mean

standard deviation

Page 34:

Back to CLT
• The mean $M$ (sometimes also denoted as $\mu_{\bar{x}}$) of the sampling distribution is equal to the population mean.

• The standard deviation (sometimes also denoted as $\sigma_{\bar{x}}$) of the sampling distribution is equal to the population standard deviation divided by the square root of $n$.
• $SE$ is called the standard error (Czech: směrodatná chyba).

$SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

$M = \mu_{\bar{x}} = \mu$

Page 35:

M and SE
Let’s have a look at our demonstration data:

1. Calculate the population mean, the population standard deviation and the standard error for n = 3.

2. Take all our sample means and calculate their mean. It should be close to the population mean.

3. Take all our sample means and calculate their standard deviation. It should be close to the standard error.

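Page 36:

M and SE
# Assumed definitions: the 2015 population and the 25 sample averages from page 20
data.set2015 <- c(4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3)
prumery2015 <- c(3, 5, 3, 4, 2, 3, 3, 3, 5, 5, 3, 4, 3, 4, 5, 4, 4, 4, 6, 3, 4, 3, 4, 3, 4)

pop_mean <- mean(data.set2015)
pop_sd <- sd(data.set2015) * sqrt(19/20)  # sd() divides by n - 1; rescale to the population SD (n = 20)
se <- pop_sd / sqrt(3)                    # standard error for n = 3

sampl_mean <- mean(prumery2015)           # should be close to pop_mean
sampl_sd <- sd(prumery2015)               # should be close to se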

Page 37:

Quiz
• As the sample size increases, the standard error
  • increases
  • decreases

• As the sample size increases, the shape of the sampling distribution gets
  • skinnier
  • wider
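Both quiz questions can be checked numerically from the formula SE = σ/√n. The value of sigma and the sample sizes below are hypothetical, picked only for illustration.

```r
sigma <- 1.74                    # hypothetical population standard deviation
n     <- c(3, 5, 10, 30, 100)    # hypothetical sample sizes

se <- sigma / sqrt(n)            # standard error for each sample size
print(data.frame(n = n, se = round(se, 3)))
```

The standard error is the standard deviation of the sampling distribution, so the way it changes with n also tells how the width of that distribution changes.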

Page 38:

Sampling distribution applet

parent distribution

sample data

sampling distributions of selected statistics

http://onlinestatbook.com/stat_sim/sampling_dist/index.html