intro to statistics for the behavioral sciences psyc 1900 lecture 5: probability and hypothesis...


Post on 20-Dec-2015


Intro to Statistics for the Behavioral Sciences

PSYC 1900

Lecture 5: Probability and Hypothesis Testing

Probability: Relative Frequency Perspective

The probability of an event is the limit of its relative frequency of occurrence as the number of draws (i.e., samples) approaches infinity.

If we have 8 blue marbles and 2 red marbles, the probability of drawing a red = 2/10 = 20% on any trial (i.e., the analytic perspective).

Across repeated trials, we would find that about 20% of them produce a red marble.

Note that we’re sampling with replacement.
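The relative frequency idea can be checked with a short simulation. This is only a sketch: the function name, the seed, and the numbers of draws are choices made for illustration, not part of the lecture. It draws from the 8-blue, 2-red jar with replacement and prints the share of red draws as the number of draws grows.

```python
import random

def relative_frequency_of_red(n_draws, seed=0):
    """Draw n_draws marbles with replacement; return the share that are red."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    jar = ["blue"] * 8 + ["red"] * 2     # 8 blue and 2 red marbles, as on the slide
    reds = sum(1 for _ in range(n_draws) if rng.choice(jar) == "red")
    return reds / n_draws

# The relative frequency should settle near the analytic value 2/10 = .20.
for n in (10, 100, 10_000, 200_000):
    print(n, relative_frequency_of_red(n))
```

With few draws the share of reds can be far from .20; with many draws it should sit close to it, which is what the limit definition describes.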

Terminology

Sampling with replacement
  After a draw, the item drawn goes back into the pool.
  Sampling in which an item drawn on trial N is replaced before the drawing on trial N+1.

Event
  The outcome of a trial.

Independent events
  Events where the occurrence of one has no effect on the probability of the occurrence of others.
  Examples: voting behavior of random citizens, marble draws.

Mutually exclusive events
  Two events are mutually exclusive when the occurrence of one precludes the occurrence of the other.
  Examples: gender, religion, handedness.

Basic Laws of Probability

Probabilities range from 0 to 1, where 1 means the event must occur and 0 means it cannot occur.

Additive Rule
  Gives the probability of occurrence of one or more mutually exclusive events.
  30 red marbles, 15 blue, 55 green = 100 total
  p(red) = .30, p(blue) = .15, p(green) = .55
  Probability of drawing a red or blue?
  Given a set of mutually exclusive events, the probability of one event or the other equals the sum of their separate probabilities.
  p(red or blue) = p(red) + p(blue) = .30 + .15 = .45

Basic Laws of Probability: Multiplicative Law

Gives the probability of the joint occurrence of independent events.
  30 red marbles, 15 blue, 55 green = 100 total
  p(red) = .30, p(blue) = .15, p(green) = .55
  Probability of drawing a red on the first trial and a red on the second?
  The probability of the joint occurrence of two or more independent events equals the product of their individual probabilities.
  p(red) x p(red) = .3 x .3 = .09
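Both laws can be worked through with exact fractions. The sketch below uses the marble counts from the slides; the variable names are illustrative, and the multiplicative step assumes sampling with replacement so that the two draws are independent.

```python
from fractions import Fraction

# Marble counts from the slides: 30 red, 15 blue, 55 green.
counts = {"red": 30, "blue": 15, "green": 55}
total = sum(counts.values())
p = {color: Fraction(n, total) for color, n in counts.items()}

# Additive rule: red and blue are mutually exclusive on a single draw.
p_red_or_blue = p["red"] + p["blue"]

# Multiplicative rule: two independent draws (with replacement), both red.
p_red_then_red = p["red"] * p["red"]

print(float(p_red_or_blue))    # .45
print(float(p_red_then_red))   # .09
```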

Sequence of coin flips: H, H, T, H, T, T, T, H, T, T, __

What is the probability of H on the next flip?
  Prob = .5; events are independent.

What is the probability of H and H on the next two flips?
  Prob = .5 x .5 = .25; events are independent.

Conditional probability of independent events: because the flips are independent, the probability of H given the earlier flips is the same as the unconditional probability, .5.

Joint Probabilities

The probability of the co-occurrence of two or more events.
  Probability of sampling a red cube from a sample of red and blue marbles and cubes:
  p(red, cube) = p(red) x p(cube), if the events are independent.
  If the events are not independent (i.e., there is a correlation among events), computing the probability is more complex.

Conditional Probabilities

The probability of one event given the occurrence of another event.
  The probability that a person will fracture a bone given that he/she has osteoporosis: p(fracture | osteoporosis) = Y
  The probability of obtaining a difference between sample means of size X, given that the null hypothesis is true.

p(no fracture) = 258/358 = .72

p(norm den, no frac) = 153/358 = .43
  Why not p(norm) x p(no frac) = .49 x .72 = .35?

p(frac | osteo) = .42; p(frac | norm) = .14

Other conditional probability examples?

Bone Density     No Fracture   Fracture   Total

Normal               153           24       177
  Row %               86           14       49%
  Column %            59           24
  Cell %              43            7

Osteoporosis         105           76       181
  Row %               58           42       51%
  Column %            41           76
  Cell %              29           21

Total                258          100       358
  % of N             72%          28%
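The probabilities quoted for the bone density table can be recomputed directly from the four cell counts. This is a sketch with illustrative names; only the counts come from the table. It also speaks to the "why not multiply?" question: the product of the marginals differs from the observed joint proportion, which is what we expect when bone density and fracture are not independent.

```python
# Cell counts from the bone density table.
counts = {
    ("normal", "no_fracture"): 153,
    ("normal", "fracture"): 24,
    ("osteoporosis", "no_fracture"): 105,
    ("osteoporosis", "fracture"): 76,
}
n = sum(counts.values())  # 358 people in total

def p_density(density):
    """Marginal probability of a bone density category."""
    return sum(c for (d, _), c in counts.items() if d == density) / n

def p_outcome(outcome):
    """Marginal probability of a fracture outcome."""
    return sum(c for (_, o), c in counts.items() if o == outcome) / n

def p_joint(density, outcome):
    """Joint probability of a density category and an outcome."""
    return counts[(density, outcome)] / n

def p_outcome_given_density(outcome, density):
    """Conditional probability: restrict attention to one row of the table."""
    row_total = sum(c for (d, _), c in counts.items() if d == density)
    return counts[(density, outcome)] / row_total

joint = p_joint("normal", "no_fracture")                      # about .43
product = p_density("normal") * p_outcome("no_fracture")      # about .36 unrounded
print(round(joint, 2), round(product, 2))
print(round(p_outcome_given_density("fracture", "osteoporosis"), 2))  # about .42
print(round(p_outcome_given_density("fracture", "normal"), 2))        # about .14
```

The slide's .35 comes from multiplying the already rounded values .49 and .72; the unrounded product is slightly larger, but either way it falls short of the observed .43.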

Discrete vs. Continuous Probability Distributions

For discrete distributions, we can calculate probabilities for specific events.
  p(Harvard, vanilla) = 7/20 = .35

Discrete vs. Continuous Probability Distributions

For continuous distributions, the case is slightly different.
  Probability that a baby will crawl at 35 weeks?
  Almost zero at exactly 35.00001 weeks.
  Events at a very specific point are infrequent.
  Density gives the probability for a specific range.
    35 weeks means from 34.5 to 35.5 weeks.
    Integrate to find the area under the curve over that interval. The probability is the proportion of the interval's area to the entire area under the curve (where the total area is set to equal 1).
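To make the area-under-the-curve idea concrete, here is a minimal sketch that assumes crawling age is normally distributed. The mean of 32 weeks and sd of 4 weeks are hypothetical values chosen only for illustration; the lecture does not give them. The normal cumulative distribution function does the integration for us.

```python
from statistics import NormalDist

# Hypothetical crawling-age distribution; mean and sd are NOT from the lecture.
crawl_age = NormalDist(mu=32.0, sigma=4.0)

def prob_between(dist, low, high):
    """Area under the density between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p_week_35 = prob_between(crawl_age, 34.5, 35.5)       # "35 weeks" as an interval
p_tiny = prob_between(crawl_age, 35.0, 35.00001)      # a near-point event
p_all = prob_between(crawl_age, 32.0 - 40.0, 32.0 + 40.0)  # essentially the whole curve

print(p_week_35, p_tiny, p_all)
```

The interval from 34.5 to 35.5 gets a usable probability, the near-point interval gets almost none, and the whole curve integrates to essentially 1.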

Sampling Distributions & Hypothesis Testing

Until now, we have primarily focused on descriptive statistics.

Although such statistics are quite useful for assessing the characteristics of samples, they cannot answer questions related to inference.
  Is the difference between two means likely to represent chance variation?

To answer such questions, the remainder of this course will focus on the statistical process of inference.

Basic Form of Inference

The most basic question is one in which we might compare the means of two groups.

If one group has a mean of 50 and the other a mean of 42 following some manipulation, can we infer that the manipulation lowered the score?

Sampling Error

To answer this question, we have to understand sampling error.

Sampling error is the variability of a statistic from sample to sample due to chance.

If I took samples from a population, the descriptives of the samples would cluster around, but not always equal, the parameters of the population.

[Figure: four samples drawn from one population, with mean = 47, sd = 9; mean = 52, sd = 10; mean = 51, sd = 12; mean = 49, sd = 9]
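Sampling error is easy to see by drawing a few samples from one known population. In this sketch the population (normal, mean 50, sd 10), the sample size of 25, and the seed are all assumptions made for illustration; they are not taken from the lecture's figure.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 50.0, 10.0     # hypothetical population parameters

def draw_sample(n):
    """Draw n scores at random from the hypothetical population."""
    return [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]

samples = [draw_sample(25) for _ in range(4)]
sample_means = [statistics.mean(s) for s in samples]
sample_sds = [statistics.stdev(s) for s in samples]

# Each sample's mean and sd land near, but not exactly on, 50 and 10.
for m, s in zip(sample_means, sample_sds):
    print(round(m, 1), round(s, 1))
```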

Hypothesis Testing

The basic question in hypothesis testing is:
  Is the given difference large enough that it does not likely stem from sampling error?

Hypothesis testing
  A process by which decisions are made regarding the values of parameters.

Sampling Distributions

The distribution of a statistic over repeated sampling from a specified population.

Both descriptive and inferential statistics (e.g., t, F, r) have sampling distributions.

They tell us what values we might expect given certain conditions.
  A conditional probability

[Figure: the same four samples, with mean = 47, sd = 9; mean = 52, sd = 10; mean = 51, sd = 12; mean = 49, sd = 9]

Sampling Distribution of the Mean

To determine if the difference between two means is likely due to sampling error, we need to know the sd of a distribution of means from the population.

Standard Error of the Mean
  The sd of a sampling distribution of means.
  The sampling distribution of the mean is the distribution of means collected from repeated sampling of the same population.
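The standard error can be approximated by simulation: draw many samples, record each mean, and take the sd of those means. The population values (mean 50, sd 10), the sample size, and the number of repetitions below are assumptions for illustration. The comparison value, sigma divided by the square root of n, is the standard textbook result for the standard error of the mean and is not stated on the slide.

```python
import math
import random
import statistics

def simulate_sample_means(pop_mean, pop_sd, n, reps, seed=0):
    """Return the means of `reps` random samples of size n from a normal population."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(pop_mean=50.0, pop_sd=10.0, n=25, reps=5000)

empirical_se = statistics.stdev(means)       # sd of the simulated sampling distribution
theoretical_se = 10.0 / math.sqrt(25)        # sigma / sqrt(n) = 2.0

print(round(statistics.mean(means), 2), round(empirical_se, 2), theoretical_se)
```

The simulated means center on the population mean, and their sd should come out close to 2.0, far smaller than the population sd of 10.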

Distribution of Sample Means

[Figure: histogram of the sampling distribution of the mean number of aggressive associates. X-axis: Mean Number Aggressive Associates (3.75 to 7.25); y-axis: Frequency (0 to 1400). Std. Dev = .45, Mean = 5.65, N = 10,000.]

Hypothesis Testing

Sampling distributions allow us to test hypotheses.
  Sampling distributions can be derived mathematically.

If the aggression mean of kids viewing a violent video is 6.5, and the “normal” population mean for kids is 5.65, does this difference imply that such videos increase aggressive thoughts?

Logic of Hypothesis Testing

Set up the relevant null hypothesis (H0)
  The sample (i.e., kids who watch violent videos) represents the same population.
  Its mean should equal the population mean of 5.65.

Calculate the mean of the sample
  Mean = 6.5

Obtain the sampling distribution and standard error

Determine the probability of obtaining a mean at least as large as the actual sample mean

On that basis, decide whether to reject or fail to reject the null hypothesis

The Null Hypothesis

At its heart, the null states that parameters are the same.
  For example, 2 means are equal.
  The difference between the means is zero.
  Any differences reflect sampling error.

Why use the null?
  Excellent starting place.
  What would the alternative be?
    We’d have to specify sampling distributions for exact alternative parameter values.

Test Statistics and Sampling Distributions

The same logic applies to test statistics as well as means.
  t’s, F’s, r’s
  A sampling distribution can be calculated for each statistic and used to evaluate the corresponding null.
    For t, a sampling distribution when H0 is true would consist of t values from an infinite number of paired samples.
  Compare the current t to the sampling distribution to determine the viability of the null.

Using Normal Distribution to Test Hypotheses

The normal distribution can be used to test hypotheses involving individual scores or sample means.
  Assumes scores or sampling distributions of the mean are normally distributed.

Going back to our example:
  Mean of kids watching violent videos = 6.5
  Population parameters (from the sampling distribution of the mean)
    Mean = 5.65, sd = .45

Using Normal Distribution to Test Hypotheses

Convert 6.5 to a z score:

$z = \frac{6.5 - 5.65}{.45} = 1.89$

p(6.5 | N(5.65, 0.45)) = .06

The .06 is the two-tailed probability of a deviation at least this large in either direction; the one-tailed probability of a mean of 6.5 or larger is about .03.
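The z conversion and the tail probabilities can be reproduced with Python's standard library. The sketch uses the lecture's numbers (sample mean 6.5, mean 5.65 and sd .45 for the sampling distribution) and assumes, as the slide does, that the sampling distribution of the mean is normal. The variable names are illustrative.

```python
from statistics import NormalDist

sample_mean = 6.5        # mean of kids who watched violent videos
null_mean = 5.65         # mean of the sampling distribution under H0
standard_error = 0.45    # sd of the sampling distribution of the mean

z = (sample_mean - null_mean) / standard_error

standard_normal = NormalDist()                 # mean 0, sd 1
one_tailed_p = 1 - standard_normal.cdf(z)      # P(mean at least this large | H0)
two_tailed_p = 2 * one_tailed_p                # deviation this large in either direction

print(round(z, 2), round(one_tailed_p, 3), round(two_tailed_p, 3))
```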

Terminology

Significance Level
  The probability with which we are willing to reject the null when it is in fact correct.
  Also called the alpha level.

Rejection Region
  The set of outcomes that will lead to rejection of the null.

Alternative Hypothesis
  The hypothesis that is adopted when the null is rejected.
  Usually the research hypothesis, e.g., that the mean for kids viewing violent videos is not 5.65:

  $\bar{X} \neq 5.65$

Type I and Type II Errors

As we’ve seen, determining whether a difference is “real” or due to sampling error requires a choice of a critical value or significance level.

Because we are making a choice, there is always the chance that the choice will be incorrect.

Type I and Type II Errors

If we use a significance level of .05:
  5% of the time we will reject the null hypothesis when it is true.
  This is a Type I error.
  p(Type I) = alpha

If we feel this amount of error is too large, what can we do to minimize Type I errors?

Type I and Type II Errors

Use a more stringent alpha level to reduce Type I errors.
  Alpha = .01; only 1% error in rejecting the null.
  This strategy has a trade-off.

Failing to reject the null when it is false is a Type II error.
  p(Type II) = beta

                       True State of World
Decision               Null True            Null False

Reject Null            Type I Error         Correct Decision
                       p = alpha            p = 1 - beta = Power

Fail to Reject Null    Correct Decision     Type II Error
                       p = 1 - alpha        p = beta
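The trade-off between alpha and beta can be illustrated with a one-tailed z test. This is a sketch under a loud assumption: it supposes the true mean for kids who watch violent videos really is 6.5, which the lecture never claims, and it reuses the mean of 5.65 and sd of .45 for the sampling distribution. The function name and its parameters are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def power_one_tailed(shift_in_se, alpha):
    """P(reject H0) for an upper-tailed z test when the true mean sits
    shift_in_se standard errors above the null mean."""
    z_critical = STANDARD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STANDARD_NORMAL.cdf(z_critical - shift_in_se)

# Assumed true effect: (6.5 - 5.65) / .45 standard errors. Hypothetical.
shift = (6.5 - 5.65) / 0.45

for alpha in (0.05, 0.01):
    power = power_one_tailed(shift, alpha)
    beta = 1 - power                      # probability of a Type II error
    print(alpha, round(power, 2), round(beta, 2))

# With no true effect, the rejection rate is just alpha (the Type I error rate).
type_one_rate = power_one_tailed(0.0, 0.05)
```

Moving alpha from .05 to .01 lowers the Type I error rate but, for the same assumed effect, raises beta and so lowers power.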

One-Tailed vs. Two-Tailed Tests

Two-tailed (nondirectional) tests are most common.
  Look for extremes in both tails (i.e., positive or negative deviations from the mean).
  Alpha = .05 has a .025 null rejection area in each tail of the sampling distribution.
  Used because one might never truly be sure what outcome to expect.

One-Tailed vs. Two-Tailed Tests

One-tailed (directional) tests are less commonly used.
  Look for extreme values in only 1 tail.
  The researcher predicts the direction of the difference.
  Alpha = .05 places the total .05 null rejection area in a single tail.
  What is the benefit in terms of power?
    Smaller differences in the predicted direction will be viewed as significant due to the increased null rejection area in that tail.
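The difference between the two kinds of test shows up in the critical values. The sketch below reuses the lecture's example (mean 6.5 against 5.65 with sd .45) and assumes a normal sampling distribution; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()
alpha = 0.05
z_observed = (6.5 - 5.65) / 0.45          # about 1.89, from the lecture's example

# Two-tailed: alpha is split, .025 in each tail.
critical_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)
# One-tailed: the full .05 sits in the upper tail.
critical_one_tailed = standard_normal.inv_cdf(1 - alpha)

reject_two_tailed = abs(z_observed) > critical_two_tailed
reject_one_tailed = z_observed > critical_one_tailed

print(round(critical_two_tailed, 2), reject_two_tailed)
print(round(critical_one_tailed, 3), reject_one_tailed)
```

The one-tailed cutoff is lower, so the same z of about 1.89 clears it but falls short of the two-tailed cutoff. That is the power benefit, and it only holds if the direction was predicted in advance.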