the four fundamental concepts of statistics - george...

The Four Fundamental Concepts of Statistics Principles underlying inferential statistics (as described by Sally Caldwell)

Upload: lamdang

Post on 09-Mar-2018


Page 1: The Four Fundamental Concepts of Statistics - George …faculty.georgebrown.ca/~tgula/stat1013/week12 Four Fundamental... · -Find mean GPA of GBC student (census is available). Solution:

The Four Fundamental Concepts of Statistics

Principles underlying inferential statistics

(as described by Sally Caldwell)


Census vs sample?

In most research a census is not possible.
- Find the mean height of a GBC student (census is impossible).
- Find the mean GPA of a GBC student (census is available).

Solution: estimate the population characteristic with a sample.


Flowchart: Making inferences with one variable


What happened in exercise 15?


Symbols you will need to learn by heart

x̄ – sample mean
μ – population mean

s – sample standard deviation
σ – population standard deviation


Concept #1 Random Sampling

In order to be able to make valid inferences about a population, sampling must be done randomly.


To choose a sample randomly means that:

1. Every member of the population being studied has an equal chance of being chosen.

2. Each individual is chosen independently.

3. All combinations, even the weirdest ones, are possible.
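As a rough illustration of conditions 1 and 3, here is a minimal Python sketch of simple random sampling. The sampling frame of 1,000 student ID numbers, the sample size of 30, and the helper name `draw_simple_random_sample` are all assumptions made for this example, not part of the slides. The standard-library call `random.sample` draws without replacement, so every member has the same chance of selection and every combination of 30 IDs is possible.

```python
import random

# Hypothetical sampling frame: ID numbers for 1,000 students (an assumption).
sampling_frame = list(range(1, 1001))

def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct members of the frame, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# The seed is fixed only so the illustration is repeatable.
sample = draw_simple_random_sample(sampling_frame, n=30, seed=42)
print(sorted(sample))
```

Because the draw is without replacement, no student can appear twice in the same sample.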


The sampling frame is the physical representation of the target population from which the sample is chosen.

It could be a list of names, or a description of the ‘boundary’ within which the target population can be found.


Concept #2 Sampling Error

Each sample contains only a fraction of the population.

A practically infinite number of samples is possible, each one different from the others.

Most samples will approximate the population and some will be way off; nevertheless, each one will contain some ‘error’.

This ‘error’ is called sampling error, and because of it, we cannot make automatic inferences from sample to population.
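Sampling error can be seen directly in a small simulation. The sketch below assumes a made-up population of 10,000 heights (mean about 167 cm, standard deviation about 10 cm) and a sample size of 25; these numbers and the helper name `sampling_error` are illustrative assumptions only. Each sample mean lands a little above or below the population mean, and that gap is the sampling error.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 heights in cm (assumed values, for illustration only).
population = [rng.gauss(167, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(sample_size):
    """Draw one random sample and return (sample mean - population mean)."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - population_mean

errors = [sampling_error(25) for _ in range(5)]
for i, err in enumerate(errors, start=1):
    print(f"sample {i}: sampling error = {err:+.2f} cm")
```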


Concept #3 Distribution of Sample Means

Imagine that the researcher is able to keep taking samples and calculating the mean for each one. This will create a set of sample means as follows: { x̄₁, x̄₂, x̄₃, … }

The collected and ordered data set containing sample means is called the distribution of sample means.


Sampling 6 dice rolls and the distribution of sample means

Each sample mean (randomly selected of course) is one sample from the infinitely many possible sample means – i.e. from the distribution of sample means.
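The dice example can be sketched in a few lines of Python. The number of repetitions (10,000) and the helper name `sample_mean_of_dice` are assumptions for illustration. Each call rolls a fair die 6 times and records the mean, and the collected means form an empirical version of the distribution of sample means.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the illustration is repeatable

def sample_mean_of_dice(n_rolls=6):
    """Roll a fair die n_rolls times and return the mean of the rolls."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return statistics.mean(rolls)

# Repeat the 6-roll sample many times to approximate the distribution of sample means.
sample_means = [sample_mean_of_dice() for _ in range(10_000)]
mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.stdev(sample_means)

print(f"mean of the sample means:   {mean_of_means:.3f}")
print(f"spread of the sample means: {spread_of_means:.3f}")
```

The mean of the sample means should land close to 3.5, the mean of a single fair die, while the spread should be noticeably smaller than the spread of individual rolls.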


Shift from population distribution to distribution of sample means (a.k.a. shift from z-score to t-score)

               Population distribution    Distribution of sample means
Shape          Normal or other            Normal
Mean           μ                          μ
Spread         Std deviation = σ          Std error = $\sigma / \sqrt{n}$
Raw score      x                          x̄
For areas      use z-tables               use t-tables


Moving from the z-tables to the t-tables

The z-score relates a raw score ‘x’ to the population distribution (normal) with a population mean μ.

The t-score relates a raw score ‘x̄’ to the distribution of sample means with a population mean μ (every sample size has its own normal distribution).

t-tables adjust to various sample sizes with degrees of freedom = sample size (n) – 1.

The larger the sample size, the closer the t-table values are to the z-table values.
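To see the t-table values approach the z-table value as the sample size grows, the sketch below compares a few two-tailed 95% critical values. The t values are copied from a standard t-table for the listed degrees of freedom; the choice of the 95% level and of these particular rows is an assumption for illustration. The z value is computed with the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

# Two-tailed 95% critical values copied from a standard t-table (d.f. -> t).
T_CRITICAL_95 = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}

# The matching z value: the 97.5th percentile of the standard normal distribution.
z_critical_95 = NormalDist().inv_cdf(0.975)

# Gap between each t value and the z value; it shrinks as d.f. increases.
gaps = {df: t_value - z_critical_95 for df, t_value in T_CRITICAL_95.items()}

for df, t_value in T_CRITICAL_95.items():
    print(f"d.f. = {df:>3}: t = {t_value:.3f}, z = {z_critical_95:.3f}, gap = {gaps[df]:.3f}")
```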


Concept #4 Central Limit Theorem

If the population (from which the samples are taken) has mean = µ and standard deviation = σ,

then the distribution of sample means will have mean = µ and standard deviation (called the standard error) = $\sigma / \sqrt{n}$.

- Note: as n (sample size) gets larger, the standard error gets smaller.
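The effect of sample size on the standard error can be checked with a short calculation. In this sketch the function name `standard_error` and the sample sizes 9, 36, and 144 are illustrative assumptions; σ = 16 is borrowed from the old-age-home example later in these slides.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# sigma = 16 is taken from the old-age-home example; the sample sizes are assumed.
for n in (9, 36, 144):
    print(f"n = {n:>3}: standard error = {standard_error(16, n):.2f}")
```

Quadrupling the sample size halves the standard error, because n sits under a square root.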


Shift from Z-score to t-score

z-score

$z = \frac{x - \mu}{\sigma}$

t-score

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

In the t-score, s is the estimate for σ.

n is the sample size; s is the sample standard deviation. Degrees of freedom = n - 1
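A minimal sketch of the t-score calculation, assuming the usual form t = (x̄ − μ) / (s / √n). The function name `t_score` and the nine ages listed are hypothetical data invented for this example; μ = 78 is borrowed from the old-age-home example in these slides.

```python
import math
import statistics

def t_score(sample, mu):
    """Return (t, degrees of freedom) for a sample against a population mean mu."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)
    return (x_bar - mu) / standard_error, n - 1

# Hypothetical ages of 9 sampled residents (assumed data, for illustration only).
ages = [70, 82, 91, 65, 77, 88, 74, 95, 80]
t, df = t_score(ages, mu=78)
print(f"t = {t:.3f} with {df} degrees of freedom")
```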


Distribution of ages of residents of an old age home

Age is distributed normally with µ = 78 and σ = 16.

Each x is an individual person.

Theoretical distribution of sample means taken from the above population

If one takes a large number of samples of size n = 9, the theoretical distribution of sample means will have a normal distribution with µ = 78 and standard error = $16 / \sqrt{9} = 5.33$.

Each ‘M’ is a separate sample mean (i.e. x̄₁, x̄₂, etc.).

Creative Commons Attribution-ShareAlike 4.0 International License. Adapted from: Bergen, A. & Stanley, D. (2014). Foundations of hypothesis testing: Probabilities, proportions, and z-scores. Lecture notes for Quantification in Psychology, W14 Semester, University of Guelph.


Sampling error and the confidence interval

Example: When taking a sample of heights of GBC students we can calculate the mean height of the sample (x̄ = 167).

Is that mean equal to the population mean (µ)? Probably not!!!

Statisticians recognized this and created an interesting way to estimate µ using µ = x̄ +/– a little bit; then added a probability of being correct. The +/– a little bit accounts for an estimate of the size of the error due to sampling.


The confidence interval for means

$\mu = \bar{x} \pm t \left( \frac{s}{\sqrt{n}} \right)$

μ – population mean
x̄ – sample mean
s – sample standard deviation
n – sample size
t – t-score, dependent on confidence level and degrees of freedom (d.f. = n - 1)

The value in the brackets is the standard error.
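The interval can be computed in a few lines once the t-score has been read from a t-table. In the sketch below, x̄ = 167 comes from the height example in these slides, but s = 10, n = 25, and the function name `confidence_interval_mean` are assumptions chosen only for illustration. The value 2.064 is the standard t-table entry for 95% confidence with 24 degrees of freedom.

```python
import math

def confidence_interval_mean(x_bar, s, n, t_critical):
    """Return (lower, upper) for mu = x_bar ± t * (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    margin = t_critical * standard_error
    return x_bar - margin, x_bar + margin

# x_bar = 167 is from the height example; s = 10 and n = 25 are assumed values.
# t_critical = 2.064 is the t-table value for 95% confidence and d.f. = 24.
lower, upper = confidence_interval_mean(x_bar=167, s=10, n=25, t_critical=2.064)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

Passing the t-score in as an argument mirrors the table lookup done by hand: a different confidence level or sample size means looking up a different value.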


What is it for?

The confidence interval aims to capture the population mean in a range. Once you solve for the unknowns you will get a statement like the following:

We are 95% confident that μ = 167 ± 4.5 ( ± error )

or… The 95% confidence interval for height is 162.5 ≤ μ ≤ 171.5 (the interval)

or… We are 95% confident that the population mean height is between 162.5 and 171.5


Confidence intervals: the fundamentals

Statisticians don’t say “The mean is 4.3”.

We don’t say “The mean is probably 4.3”

We don’t say “The mean is close to 4.3”.

All we can manage is: “The mean is close to 4.3 …. Probably….. And we go on at great length to tell you how wrong we probably are.”


Population of IQ Scores (Xs) [figure; labeled values: 100.00, 15.00]

Correct: A confidence interval is an interval estimate, based on the sample mean, that includes the population mean a certain percentage of the time (e.g., 95%), were we to sample from the same population repeatedly.

WRONG: The range of sample means we would get if we repeatedly sampled from the same population.

WRONG: The range of a certain percentage of population values centered around the mean.
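The "correct" reading can be checked by simulation: sample repeatedly, build an interval each time, and count how often the interval contains the population mean. The sketch below treats the figure's values 100 and 15 as the population mean and standard deviation, which is an assumption, as are the sample size of 25, the 2,000 repetitions, and the helper name `interval_captures_mu`. The value 2.064 is the standard t-table entry for 95% confidence with 24 degrees of freedom.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed so the illustration is repeatable

MU, SIGMA = 100, 15   # assumed to be the population mean and standard deviation
N = 25                # sample size (an assumption)
T_CRITICAL = 2.064    # t-table value for 95% confidence, d.f. = 24

def interval_captures_mu():
    """Draw one sample, build its 95% interval, and report whether it contains MU."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.mean(sample)
    margin = T_CRITICAL * statistics.stdev(sample) / math.sqrt(N)
    return x_bar - margin <= MU <= x_bar + margin

trials = 2_000
coverage = sum(interval_captures_mu() for _ in range(trials)) / trials
print(f"{coverage:.1%} of the intervals captured the population mean")
```

The share of intervals that capture µ should come out near 95%. The 95% describes the procedure over many samples, not the spread of sample means or of individual scores.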

Creative Commons Attribution-ShareAlike 4.0 International License. Adapted from: Bergen, A. & Stanley, D. (2014). Foundations of hypothesis testing: Probabilities, proportions, and z-scores. Lecture notes for Quantification in Psychology, W14 Semester, University of Guelph.


The confidence interval for proportions

$\text{population proportion} = p \pm z \sqrt{\frac{p(1 - p)}{n}}$

p – sample proportion
p(1 - p) – estimate of variation
n – sample size
z – z-score, dependent on confidence level only
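A minimal sketch of the proportion interval, assuming the form p ± z·√(p(1 − p)/n). The survey figures (120 "yes" answers out of 400) and the function name `confidence_interval_proportion` are hypothetical. The z-score is obtained from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
import math
from statistics import NormalDist

def confidence_interval_proportion(successes, n, confidence=0.95):
    """Return (lower, upper) for p ± z * sqrt(p * (1 - p) / n)."""
    p = successes / n                                   # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed z-score
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical survey: 120 of 400 sampled students answer "yes" (assumed data).
lower, upper = confidence_interval_proportion(120, 400)
print(f"95% confidence interval for the proportion: {lower:.3f} to {upper:.3f}")
```

Unlike the interval for means, no degrees of freedom are needed here, since the z-score depends on the confidence level only.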


Practice and homework

Work on in-class exercises 17-18.