sampling distributions note: homework 3 due 3/23
Post on 20-Dec-2015
![Page 1: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/1.jpg)
Sampling Distributions
Note: Homework 3 due 3/23
![Page 2: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/2.jpg)
Probability distributions
• If we measure a random variable many times, we can build up a distribution of the values it can take.
• Imagine an underlying distribution of values which we would get if it were possible to take more and more measurements under the same conditions.
• This gives the probability distribution for the variable.
![Page 3: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/3.jpg)
Continuous probability distributions
• Because continuous random variables can take all values in a range, it is not possible to assign probabilities to individual values.
• Instead we have a continuous curve, called a probability density function, which allows us to calculate the probability that a value falls within any interval.
• This probability is calculated as the area under the curve between the values of interest. The total area under the curve must equal 1.
![Page 4: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/4.jpg)
Normal (Gaussian) distributions
• Normal (also known as Gaussian) distributions are by far the most commonly used family of continuous distributions.
• They are ‘bell-shaped’ and are indexed by two parameters:
– The mean: the distribution is symmetric about this value
– The standard deviation: this determines the spread of the distribution. Roughly 2/3 (about 68%) of the distribution lies within 1 standard deviation of the mean, and about 95% within 2 standard deviations.
![Page 5: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/5.jpg)
The probability of continuous variables
• IQ test
– Mean = 100 and sd = 15
• What is the probability of randomly selecting an individual with a test score of 130 or greater?
– P(X ≤ 95)?
– P(X ≥ 112)?
– P(X ≤ 95 or X ≥ 112)?
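The four probabilities above can be worked out as areas under the normal curve. A minimal sketch, assuming IQ scores are exactly normal with the mean and sd given on the slide; the variable names are illustrative only, and `NormalDist` comes from the Python standard library (3.8+).

```python
from statistics import NormalDist

# IQ scores modeled as normal with mean 100 and sd 15 (values from the slide).
iq = NormalDist(mu=100, sigma=15)

p_ge_130 = 1 - iq.cdf(130)   # upper tail: P(X >= 130)
p_le_95 = iq.cdf(95)         # lower tail: P(X <= 95)
p_ge_112 = 1 - iq.cdf(112)   # upper tail: P(X >= 112)

# The two tails do not overlap (mutually exclusive events),
# so the "or" probability is simply their sum.
p_either = p_le_95 + p_ge_112

for label, p in [
    ("P(X >= 130)", p_ge_130),
    ("P(X <= 95)", p_le_95),
    ("P(X >= 112)", p_ge_112),
    ("P(X <= 95 or X >= 112)", p_either),
]:
    print(f"{label} = {p:.4f}")
```

For a continuous variable, P(X ≥ 130) and P(X > 130) are the same, because any single value has probability zero.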
![Page 6: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/6.jpg)
The probability of continuous variables (cont.)
• What is the probability of randomly selecting three people with a test score greater than 112?
– Remember the multiplication rule for independent events.
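A short sketch of the multiplication rule applied to this question, assuming the three people are selected independently and that scores are normal with mean 100 and sd 15 as on the previous slide. Variable names are illustrative.

```python
from statistics import NormalDist

# Same IQ model as the previous slide: mean 100, sd 15.
iq = NormalDist(mu=100, sigma=15)

# Probability that one randomly selected person scores above 112.
p_one = 1 - iq.cdf(112)

# Multiplication rule for independent events:
# P(A and B and C) = P(A) * P(B) * P(C).
p_three = p_one ** 3

print(f"P(one person > 112)  = {p_one:.4f}")
print(f"P(all three > 112)   = {p_three:.4f}")
```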
![Page 7: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/7.jpg)
Introduction to Statistical Inference
Chapter 11
![Page 8: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/8.jpg)
Populations vs. Samples
• Population
– The complete set of individuals
• Characteristics are called parameters
• Sample
– A subset of the population
• Characteristics are called statistics.
– In most cases we cannot study all the members of a population
![Page 9: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/9.jpg)
![Page 10: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/10.jpg)
Inferential Statistics
• Statistical Inference
– A series of procedures in which the data obtained from samples are used to make statements about some broader set of circumstances.
![Page 11: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/11.jpg)
Two different types of procedures
• Estimating population parameters
– Point estimation
• Using a sample statistic to estimate a population parameter
– Interval estimation
• Estimation of the amount of variability in a sample statistic when many samples are repeatedly taken from a population.
• Hypothesis testing
– The comparison of sample results with a known or hypothesized population parameter
![Page 12: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/12.jpg)
These procedures share a fundamental concept
• Sampling distribution
– A theoretical distribution of the possible values of a sample statistic if an infinite number of same-sized samples were taken from a population.
![Page 13: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/13.jpg)
Example of the sampling distribution of a discrete variable
[Figure: Binomial sampling distribution of an unbiased coin tossed 10 times. x-axis: number of heads in 10 tosses (0 to 10); y-axis: p(x), from 0 to 0.3.]
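The values plotted in this figure can be reproduced from the binomial formula. A minimal sketch, assuming a fair coin (p = 0.5) and 10 independent tosses as stated in the figure title; the names `n`, `p`, and `pmf` are illustrative.

```python
from math import comb

n = 10    # number of tosses
p = 0.5   # probability of heads for an unbiased coin

# Binomial probability of exactly k heads in n tosses:
# C(n, k) * p^k * (1 - p)^(n - k)
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
```

The distribution is symmetric about 5 heads, and the probabilities across all 11 outcomes sum to 1.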
![Page 14: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/14.jpg)
Continuous Distributions
• Interval or ratio level data
– Weight, height, achievement, etc.
• JellyBlubbers!!!
![Page 15: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/15.jpg)
Histogram of the Jellyblubber population
![Page 16: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/16.jpg)
Repeated sampling of the Jellyblubber population (n = 3)
![Page 17: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/17.jpg)
Repeated sampling of the Jellyblubber population (n = 5)
![Page 18: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/18.jpg)
Repeated sampling of the Jellyblubber population (n = 10)
![Page 19: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/19.jpg)
Repeated sampling of the Jellyblubber population (n = 40)
![Page 20: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/20.jpg)
For more on this concept
• Visit
– http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
![Page 21: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/21.jpg)
Central Limit Theorem
• Proposition 1:
– The mean of the sampling distribution will equal the mean of the population.
$\mu_{\bar{x}} = \mu$
• Proposition 2:
– The sampling distribution of means will be approximately normal regardless of the shape of the population, provided the sample size is sufficiently large.
• Proposition 3:
– The standard deviation of the sampling distribution (the standard error) equals the standard deviation of the population divided by the square root of the sample size. (see 11.5 in text)
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{N}}$
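The propositions can be checked by simulation, in the spirit of the repeated-sampling slides. This is only a sketch: the Jellyblubber data are not available here, so the population below is a hypothetical right-skewed one (exponential with mean about 9), and the number of repetitions is an arbitrary choice.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical skewed population (an assumption, NOT the Jellyblubber data).
population = [random.expovariate(1 / 9) for _ in range(10_000)]
mu = fmean(population)
sigma = pstdev(population)

def sample_means(n, reps=5_000):
    """Draw `reps` random samples of size n and return their means."""
    return [fmean(random.choices(population, k=n)) for _ in range(reps)]

results = {}
for n in (3, 5, 10, 40):  # sample sizes used on the slides
    means = sample_means(n)
    observed_mean = fmean(means)       # compare with mu (Proposition 1)
    observed_se = pstdev(means)        # compare with sigma / sqrt(n) (Proposition 3)
    predicted_se = sigma / n ** 0.5
    results[n] = (observed_mean, observed_se, predicted_se)
    print(f"n={n:2d}  mean of means={observed_mean:.2f} (mu={mu:.2f})  "
          f"sd of means={observed_se:.2f} (predicted {predicted_se:.2f})")
```

Plotting a histogram of `means` for each `n` would show the shape moving toward a bell curve as `n` grows (Proposition 2), even though the population itself is skewed.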
![Page 22: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/22.jpg)
Application of the sampling distribution
• Sampling error
– The difference between the sample mean and the population mean.
• Assumed to be due to random error.
• From the jellyblubber experience we know that a sampling distribution of means will be approximately normally distributed with
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{N}}$
![Page 23: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/23.jpg)
Standard Error of the Mean and Confidence Intervals
• We can estimate how much variability there is among potential sample means by calculating the standard error of the mean.
$\text{s.e.}_{\bar{x}} = \dfrac{\sigma}{\sqrt{N}}$
![Page 24: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/24.jpg)
Confidence Intervals
• With our Jellyblubbers
– One random sample (n = 3)
• Mean = 9
• $\text{s.e.}_{\bar{x}} = \dfrac{6.132}{\sqrt{3}} = 3.54$
– Therefore:
• 68% CI = 9 ± 1(3.54)
• 95% CI = 9 ± 1.96(3.54)
• 99% CI = 9 ± 2.58(3.54)
![Page 25: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/25.jpg)
Confidence Intervals
• With our Jellyblubbers
– One random sample (n = 30)
• Mean = 8.90
• $\text{s.e.}_{\bar{x}} = \dfrac{6.132}{\sqrt{30}} = 1.12$
– Therefore:
• 68% CI = 8.90 ± 1(1.12)
• 95% CI = 8.90 ± 1.96(1.12)
• 99% CI = 8.90 ± 2.58(1.12)
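The interval endpoints for both samples can be computed directly. A minimal sketch, assuming 6.132 is the population standard deviation used on the slides and using the same z multipliers (1, 1.96, 2.58); the function name `confidence_intervals` is illustrative.

```python
from math import sqrt

SIGMA = 6.132  # standard deviation used on the slides (assumed to be the population sd)
Z = {68: 1.00, 95: 1.96, 99: 2.58}  # z multipliers from the slides

def confidence_intervals(sample_mean, n, sigma=SIGMA):
    """Return the standard error and a {level: (lower, upper)} dict."""
    se = sigma / sqrt(n)  # standard error of the mean
    intervals = {
        level: (sample_mean - z * se, sample_mean + z * se)
        for level, z in Z.items()
    }
    return se, intervals

se_3, ci_3 = confidence_intervals(9.0, 3)      # sample with n = 3
se_30, ci_30 = confidence_intervals(8.90, 30)  # sample with n = 30

for n, se, cis in [(3, se_3, ci_3), (30, se_30, ci_30)]:
    print(f"n = {n}: s.e. = {se:.2f}")
    for level, (lower, upper) in cis.items():
        print(f"  {level}% CI: {lower:.2f} to {upper:.2f}")
```

The larger sample gives a much narrower interval at every confidence level, because the standard error shrinks with the square root of the sample size.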
![Page 26: Sampling Distributions Note: Homework 3 due 3/23](https://reader035.vdocuments.mx/reader035/viewer/2022062714/56649d4e5503460f94a2dac9/html5/thumbnails/26.jpg)
Hypothesis Testing (see handout)
1. State the research question.
2. State the statistical hypothesis.
3. Set decision rule.
4. Calculate the test statistic.
5. Decide if result is significant.
6. Interpret result as it relates to your research question.
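As one possible illustration of steps 2 to 5, here is a sketch of a two-tailed one-sample z test. The hypothesized mean of 10 and the 0.05 significance level are assumptions made up for this example; they do not come from the slides or the handout. The sample mean, n, and standard deviation reuse the n = 30 Jellyblubber sample.

```python
from math import sqrt
from statistics import NormalDist

# Step 2: statistical hypotheses, H0: mu = MU_0 versus H1: mu != MU_0.
MU_0 = 10.0         # hypothetical value, chosen only for illustration
SIGMA = 6.132       # standard deviation from the confidence interval slides
N = 30              # sample size from the slides
SAMPLE_MEAN = 8.90  # sample mean from the slides

# Step 3: decision rule, reject H0 if the two-tailed p-value is below ALPHA.
ALPHA = 0.05        # assumed significance level

# Step 4: test statistic, the distance of the sample mean from MU_0
# measured in standard errors.
z = (SAMPLE_MEAN - MU_0) / (SIGMA / sqrt(N))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: decide whether the result is significant.
significant = p_value < ALPHA

print(f"z = {z:.3f}, p = {p_value:.3f}, significant at {ALPHA}: {significant}")
```

Steps 1 and 6 (stating the research question and interpreting the result) depend on the study itself and are not something the code can supply.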