Statistics Midterm 2 Review

Stephanie Oliveira


CHAPTER 6: The Normal Curve, Standardization and Z scores

Normal Curve: a specific bell-shaped curve that is unimodal, symmetric and defined mathematically.

Standardization, z Scores, and the Normal Curve

Standardization: Converts individual scores from different normal distributions to a shared normal distribution with a known mean, standard deviation, and percentiles.

The Need for Standardization

Z Score: the number of standard deviations a particular score is from the mean.

Transforming Raw Scores into Z Scores.

Transforming Z Scores into Raw Scores

Z distribution: a normal distribution of standardized scores.

Standard normal distribution: a normal distribution of z scores.

The standardized z distribution allows us to do the following:
1. Transform raw scores into standardized scores called z scores.
2. Transform z scores back into raw scores.
3. Compare z scores to each other, even when the z scores represent raw scores on different scales.
4. Transform z scores into percentiles that are more easily understood.

Transforming z Scores into Percentiles

Z scores are useful because:
1. They give us a sense of where a score falls in relation to the mean of its population (in terms of the standard deviation of its population).
2. Z scores allow us to compare scores from different distributions.
3. Z scores can be transformed into percentiles.

Raw score to z score: $z = \frac{X - \mu}{\sigma}$

z score to raw score: $X = z\sigma + \mu$
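The two conversion formulas can be tried out with a short Python sketch. The function names and the example scores (two exams on different scales) are made up for illustration and are not from the course material.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations the raw score x lies from the mean mu."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Convert a z score back to the raw-score scale."""
    return z * sigma + mu


# Hypothetical scores on two exams that use different scales.
z_exam_a = z_score(82, mu=70, sigma=8)
z_exam_b = z_score(650, mu=500, sigma=100)

print(z_exam_a, z_exam_b)            # both are 1.5 SDs above their own mean
print(raw_score(1.5, mu=70, sigma=8))  # back to the raw score on exam A
```

Because both scores land on the same z scale, they can be compared directly even though the raw scales differ.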


The Central Limit Theorem

Central limit theorem: refers to how a distribution of sample means is a more normal distribution than a distribution of scores, even when the population distribution is not normal.

The central limit theorem demonstrates two important principles:
1. Repeated sampling of means approximates a normal curve, even when the original population is not normally distributed.
2. A distribution of means is less variable than a distribution of individual scores.

Distribution of means: a distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population.

The mean of the distribution of sample means is the same as the population mean.

Standard error: the name for the standard deviation of a distribution of means. The standard error measures (roughly) the average difference between M and μ that should occur by random sampling alone (i.e., roughly, the average value of M − μ).
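A small simulation can make the two principles concrete. This is only a sketch under assumptions: the exponential population, the sample size of 30, and the 2000 repetitions are arbitrary choices, and it draws many random samples rather than all possible samples.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def draw_sample(n):
    """Draw n scores from a skewed (exponential) population with mean 1 and SD 1."""
    return [rng.expovariate(1.0) for _ in range(n)]


N = 30              # hypothetical sample size
NUM_SAMPLES = 2000  # number of repeated samples

individual_scores = draw_sample(NUM_SAMPLES)
sample_means = [statistics.mean(draw_sample(N)) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.mean(sample_means)
sd_of_scores = statistics.pstdev(individual_scores)
sd_of_means = statistics.pstdev(sample_means)  # empirical standard error

print("mean of sample means:", round(mean_of_means, 3))
print("SD of individual scores:", round(sd_of_scores, 3))
print("SD of sample means:", round(sd_of_means, 3))
print("population SD / sqrt(N):", round(1.0 / N ** 0.5, 3))
```

The mean of the sample means should sit close to the population mean of 1, and the spread of the means should be much smaller than the spread of individual scores.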

Formula for standard error: $\sigma_M = \frac{\sigma}{\sqrt{N}}$

You can find sample means from z scores using: $M = \mu + z\sigma_M$

Chapter 7: Hypothesis Testing with z Tests

Hypothesis Testing with z Tests

Z test: a hypothesis test in which we compare data from one sample to a population for which we know the mean and the standard deviation.

Given a score, find a proportion

a) Finding Proportions and Probabilities
Step 1 – Sketch the normal distribution and shade the target area
Step 2 – Choose the appropriate z-table column
Step 3 – Convert X (raw score) to a z score
Step 4 – Use the z table to find the proportion (probability) for your raw score (X)
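The standard error, the sample-mean formula, and the z-table lookup can be tied together in a short Python sketch. It assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed z table; the function names and the numbers (μ = 100, σ = 15, N = 25, M = 106) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the distribution of means: sigma / sqrt(N)."""
    return sigma / sqrt(n)


def z_for_mean(m, mu, sigma_m):
    """z statistic for a sample mean m on the distribution of means."""
    return (m - mu) / sigma_m


def mean_from_z(z, mu, sigma_m):
    """Sample mean that corresponds to a given z statistic."""
    return mu + z * sigma_m


mu, sigma, n = 100, 15, 25          # hypothetical population and sample size
sigma_m = standard_error(sigma, n)  # 15 / 5
z = z_for_mean(106, mu, sigma_m)

# Proportion of sample means expected to fall below M = 106
# (plays the role of the z-table lookup).
proportion_below = NormalDist().cdf(z)

print(sigma_m, z, round(proportion_below, 4))
print(mean_from_z(z, mu, sigma_m))  # recovers the sample mean
```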


Raw scores, z Scores, and Percentages

The z table allows us to translate the standardized z distribution into percentages and individual z scores into percentile ranks.

We can determine the percentage associated with a given z statistic by following two steps:
o Step 1: Convert the raw score into a z score.
o Step 2: Look up the z score on the z table to find the percentage of scores between the mean and that z score.

Once we know the associated percentage (for example, 33.65%), we can determine a number of percentages related to the z score:
o Calculating the percentile for a positive z score: We add 50% to the percentage between the mean and that z score to get the total percentage below that z score.
o Calculating the percentage above a positive z score: We subtract the percentage between the mean and that z score from 50% to get the percentage above that z score.
o Calculating the percentage at least as extreme as our z score: For a positive z score, we double the percentage above that z score to get the percentage of scores that are at least as extreme.
o Calculating a score from a percentile: We can convert a percentile to a raw score by calculating the percentage between the mean and the z score, and looking up that percentage on the z table to find the associated z score. We would then convert the z score to a raw score using the formula $X = z\sigma + \mu$.
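The percentage calculations above can be checked with a short Python sketch. It assumes `statistics.NormalDist` from the standard library in place of a printed z table, and z = 0.98 is chosen because it corresponds to roughly 33.65% between the mean and the z score. The function names and the μ = 100, σ = 15 scale are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def pct_between_mean_and_z(z):
    """Percentage of scores between the mean and z (what the z table lists)."""
    return abs(STANDARD_NORMAL.cdf(z) - 0.5) * 100


def percentile(z):
    """Percentage of scores below z."""
    return STANDARD_NORMAL.cdf(z) * 100


def pct_above(z):
    """Percentage of scores above z."""
    return (1 - STANDARD_NORMAL.cdf(z)) * 100


def pct_at_least_as_extreme(z):
    """Percentage of scores at least as far from the mean as z, in both tails."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z))) * 100


def raw_score_from_percentile(pct, mu, sigma):
    """Find the z score for a percentile, then convert it to a raw score."""
    z = STANDARD_NORMAL.inv_cdf(pct / 100)
    return z * sigma + mu


z = 0.98
print(round(pct_between_mean_and_z(z), 2))   # about 33.65
print(round(percentile(z), 2))               # 50% + the value above
print(round(pct_above(z), 2))                # 50% - the value above
print(round(pct_at_least_as_extreme(z), 2))  # double the percentage above
print(round(raw_score_from_percentile(percentile(z), mu=100, sigma=15), 1))
```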

The Assumptions of Hypothesis Testing

Assumption: A characteristic that we ideally require the population from which we are sampling to have so that we can make accurate inferences.

Parametric Test: an inferential statistical analysis based on a set of assumptions about the population.

Nonparametric test: an inferential statistical analysis that is not based on a set of assumptions about the population.

Robust: describes hypothesis tests that produce fairly accurate results even when the data suggest that the population might not meet some of the assumptions.

Critical value: a test statistic value beyond which we reject the null hypothesis (also called a cutoff).


Critical region: the area in the tails of the distribution in which we reject the null hypothesis if our test statistic falls there.

A finding is statistically significant if the data differ from what we would expect by chance if there were no actual difference.
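The critical value and critical region definitions can be sketched in Python. This assumes `statistics.NormalDist` in place of the z table, an alpha of .05 by default, and, for the one-tailed case, a critical region in the upper tail only. The function names are made up for illustration.

```python
from statistics import NormalDist


def critical_values(alpha=0.05, two_tailed=True):
    """Return (lower, upper) z cutoffs; lower is None for a one-tailed (upper) test."""
    if two_tailed:
        upper = NormalDist().inv_cdf(1 - alpha / 2)
        return -upper, upper
    return None, NormalDist().inv_cdf(1 - alpha)


def in_critical_region(z, alpha=0.05, two_tailed=True):
    """True if the test statistic falls beyond a cutoff, so we reject the null."""
    lower, upper = critical_values(alpha, two_tailed)
    if two_tailed:
        return z < lower or z > upper
    return z > upper  # assumption: the one-tailed test predicts an increase


print(critical_values())                         # roughly (-1.96, 1.96)
print(in_critical_region(2.10))                  # beyond the two-tailed cutoff
print(in_critical_region(1.80))                  # inside the two-tailed cutoffs
print(in_critical_region(1.80, two_tailed=False))  # beyond the one-tailed cutoff
```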

Assumptions:
1. The dependent variable is scale
   DV measured using an interval or ratio (scale) measure
2. Sample is a random sample
   Use random sampling to get a representative sample
3. Independent observations
   Individual observations or measurements should not be affected by other observations
4. Normal sampling distribution
   Usually don’t need to worry about this when the sample is large enough, due to the Central Limit Theorem

Steps of the Hypothesis Test

1. Identify the populations, comparison distribution, the appropriate statistical test, and the assumptions of that test
2. State your hypotheses (in words and symbols).
3. Determine the characteristics of the comparison distribution
   a) Mean & standard error for z-test
4. Locate the critical region (i.e., critical cutoffs)
   a) Identify the alpha level and whether the test is one- or two-tailed
   b) Locate the boundaries of the critical region.
   c) State your decision rule.
5. Compute your test statistic (e.g., z-score)
6. Make a decision and interpret the result in terms of your independent and dependent variables.

Chapter 8: Confidence Intervals, Effect Size, and Statistical Power

Confidence Intervals

Point estimate: a summary statistic from a sample that is just one number used as an estimate of the population parameter.

Interval estimate: based on a sample statistic and provides a range of plausible values for the population parameter.

Confidence interval: an interval estimate, based on the sample statistic, that includes the population mean a certain percentage of the time, were we to sample from the same population repeatedly.

Calculating Confidence Intervals with z Distributions

Step 1: Draw a picture of a distribution that will include the confidence interval.


Step 2: Indicate the bounds of the confidence interval on the drawing.
Step 3: Determine the z statistics that fall at each line marking the middle 95%.
Step 4: Turn the z statistics back into raw means.
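The calculation steps can be sketched in Python as a mean plus or minus z times the standard error. This assumes `statistics.NormalDist` instead of a z table and a known population standard deviation; the function name and the example values (M = 106, σ = 15, N = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of a z-based confidence interval."""
    sigma_m = sigma / sqrt(n)                        # standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # cutoff for the middle 95%
    return sample_mean - z * sigma_m, sample_mean + z * sigma_m


lower, upper = z_confidence_interval(sample_mean=106, sigma=15, n=25)
print(round(lower, 2), round(upper, 2))
```

With a standard error of 3 and a cutoff near 1.96, the interval runs roughly 5.88 points either side of the sample mean.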

$M_{lower} = -z(\sigma_M) + M$

$M_{upper} = z(\sigma_M) + M$

Effect Size and Prep

The effect of sample size on statistical significance

Statistical significance (rejecting the null hypothesis) provides only limited information about what is happening in any particular situation. It tells you only that two groups are different.

Indeed, statistical significance only tells you that a difference, at the population level, is not zero (i.e., that the null hypothesis is rejected), which is not very informative on its own.

When a hypothesis test rejects H0, we say there has been a significant effect. This does not mean there has been a substantial effect.

Statistical significance indicates that the result was unlikely to have occurred due to random sampling/assignment alone. Therefore, the effect of the treatment/manipulation is not zero.
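A quick sketch of how sample size alone can change the significance decision: the same 2-point difference from the population mean is tested at three sample sizes. The numbers (μ = 100, σ = 15, M = 102) and the two-tailed 1.96 cutoff are assumptions made for illustration.

```python
from math import sqrt


def z_statistic(sample_mean, mu, sigma, n):
    """z statistic for a sample mean, using the standard error sigma / sqrt(N)."""
    return (sample_mean - mu) / (sigma / sqrt(n))


mu, sigma, sample_mean = 100, 15, 102  # hypothetical values
CUTOFF = 1.96                          # two-tailed critical value at alpha = .05

for n in (25, 100, 400):
    z = z_statistic(sample_mean, mu, sigma, n)
    print(n, round(z, 2), "significant" if abs(z) > CUTOFF else "not significant")
```

The difference between the means never changes, yet the result crosses the cutoff once the sample is large enough, which is why significance alone says little about how big an effect is.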

Practical Significance: Effect Size

Effect size (e.g., Cohen’s d) is a way of estimating the size of a difference at the population level.

The amount of overlap between two distributions can be decreased in 2 ways:
o If their means are farther apart
o If the variation within each population is smaller

Cohen’s d

Cohen’s d: a measure of effect size that assesses the difference between two means in terms of standard deviation, not standard error.


You need to classify the d-value you calculated as small, medium, or large.
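A minimal Python sketch of computing and classifying d for a z test. Two things here are assumptions not stated in these notes: the formula d = (M − μ) / σ, and the cutoffs of 0.2, 0.5, and 0.8, which are Cohen's commonly cited conventions. The function names and example values are hypothetical.

```python
def cohens_d(sample_mean, mu, sigma):
    """Difference between means in standard deviation (not standard error) units."""
    return (sample_mean - mu) / sigma


def classify_d(d):
    """Label an effect size using the conventional 0.2 / 0.5 / 0.8 cutoffs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


d = cohens_d(sample_mean=106, mu=100, sigma=15)
print(round(d, 2), classify_d(d))
```

Note that N does not appear anywhere in the calculation, so d does not grow just because the sample is larger.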

Statistical power: a measure of our ability to reject the null hypothesis given that the null hypothesis is false. One-tailed tests have more power.

Type II errors

Type II error: when there is a mean change or difference at the population level and we fail to reject H0.

Due to random sampling, sample level results may not always reflect what is true at the population level. Consequently, when we make decisions based on samples, we can make errors. That is, the conclusion we make based on our sample could be incorrect.

Decision based on sample (columns) by what is true at the population level (rows):

What is true at the population level | Fail to reject null hypothesis (concluding no mean change or difference) | Reject null hypothesis (concluding mean change or difference exists)
Mean change or difference exists | Type II error | Correct

3 Factors Affecting Power

Larger sample size increases power
One-tailed tests have more power than two-tailed tests
Power increases as the alpha level increases (alpha is only nominally in your control, since journals conventionally use p < .05 to determine significance)
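The three factors can be checked numerically with a rough power calculation for a z test. This is a sketch under assumptions: the true population mean is taken as known (106 against a null mean of 100, σ = 15), the one-tailed test predicts an increase, and `statistics.NormalDist` stands in for the z table. The function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_test_power(true_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """Probability of rejecting H0 when the population mean really is true_mean."""
    shift = (true_mean - mu) / (sigma / sqrt(n))  # true difference in standard errors
    if two_tailed:
        crit = Z.inv_cdf(1 - alpha / 2)
        return (1 - Z.cdf(crit - shift)) + Z.cdf(-crit - shift)
    crit = Z.inv_cdf(1 - alpha)  # assumption: critical region in the upper tail
    return 1 - Z.cdf(crit - shift)


base = z_test_power(106, 100, 15, n=25)
bigger_n = z_test_power(106, 100, 15, n=100)
one_tailed = z_test_power(106, 100, 15, n=25, two_tailed=False)
bigger_alpha = z_test_power(106, 100, 15, n=25, alpha=0.10)

print(round(base, 3), round(bigger_n, 3), round(one_tailed, 3), round(bigger_alpha, 3))
```

Each change raises power relative to the baseline, and the probability of a Type II error is 1 minus the power in each case.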