P-values are Random Variables

Duncan Murdoch

Department of Statistical and Actuarial Sciences
University of Western Ontario

October 4, 2007

1 of 29

Outline

1 Motivation

2 What are p-values?

3 How should we teach them?

4 Examples

This is joint work with Yu-Ling Tsai and James Adcock.

Motivation

Teaching introductory statistics

I’ve been teaching hypothesis testing in introductory statistics courses since 1988.

Over time I have gradually changed the way I teach hypothesis testing and p-values; this talk describes my current ideas.

A few recent events triggered the urge to write this up...

A trigger

On the R-help list in May 2006, regarding inconsistent results (p = 0.7767, p = 0.9059, p = 0.1887) when running a normality test on randomly generated data:

“I mistakenly had thought the p-values would be more stable since I am artificially creating a random normal distribution. Is this expected for a normality test or is this an issue with how rnorm is producing random numbers? I guess if I run it many times, I would find that I would get many large values for the p-value?”
– Name withheld

A response

Discussion followed on why this was not a reasonable expectation, including this:

“We see this misunderstanding worryingly often. Worrying because it reveals that a fundamental aspect of statistical inference has not been grasped: that p-values are designed to be (approximately) uniformly distributed and fall below any given level with the stated probability, when the null hypothesis is true.”
– Peter Dalgaard

A second trigger

At her thesis defence, Yu-Ling presented histograms of simulated p-values to illustrate deficiencies in some asymptotic approximations:

[Figure: six histograms of simulated p-values (horizontal axis 0.0 to 1.0, vertical axis 0 to 4); panel labels: Score, Regularized Score, Gamma; axis label: Fitted margins]

One of the examiners questioned this way of presenting the results.

Advice on the web

On a medical school research methods course web page:

“The t-test value for the stress test indicates that the probability that the null hypothesis is true is smaller than one-in-twenty.”

I pointed out that this isn’t correct, and received the response:

“[This] is written the way it is to give students a way to make decisions about statistical results in journal articles. It is not for people learning about statistics. Thus, the interpretation of p-values is correct enough.”

What are p-values?

The definition of a p-value

Given a null hypothesis $H_0$, an alternative $H_1$, and a test statistic $T$, the p-value is

“the probability, computed assuming that $H_0$ is true, that the test statistic would take a value as extreme or more extreme than that actually observed.”
– Moore, D. S. (2007), The Basic Practice of Statistics

In the typical case where large values of $T$ are considered to be extreme, this is

$$p = P(T \ge t_{\mathrm{obs}} \mid H_0)$$
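As a minimal sketch of this definition, assume a statistic $T$ that is standard normal under $H_0$. That null distribution and the function name `upper_tail_p_value` are illustrative assumptions, not part of the talk. The p-value is then the upper-tail area beyond the observed value:

```python
from statistics import NormalDist


def upper_tail_p_value(t_obs: float, null_dist: NormalDist = NormalDist()) -> float:
    """p = P(T >= t_obs | H0) when T follows null_dist under H0.

    The standard normal default is an assumption made for illustration.
    """
    return 1.0 - null_dist.cdf(t_obs)


# An observed statistic near the familiar one-sided 5% cut-off.
print(f"{upper_tail_p_value(1.645):.3f}")
```

Any other continuous null distribution can be substituted, provided its CDF is available.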

Interpretation of a p-value

How should we interpret p?

“the smaller the p-value, the stronger the evidence against $H_0$ provided by the data.”
– Moore, D. S. (2007), The Basic Practice of Statistics

How are p-values interpreted in the wild?

The definition is $p = P(T \ge t_{\mathrm{obs}} \mid H_0)$. Some common misconceptions (from Wikipedia):

1 the probability that the null hypothesis is true, i.e. $P(H_0 \mid \text{data})$.

2 the probability that a finding is “merely a fluke”.

3 the probability of falsely rejecting the null hypothesis, i.e. $P[H_0 \cap (t_{\mathrm{obs}} \ge t_{\mathrm{crit}})]$.

4 the probability that a replicating experiment would not yield the same conclusion.

How should we teach them?

How do we teach confidence intervals?

Definitions are awkward:

“A level C confidence interval for a parameter is an interval computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter.”
– Moore and McCabe (2003), Introduction to the Practice of Statistics

But a picture tells the story...

[Figure: simulated confidence intervals drawn as horizontal segments; horizontal axis: Interval (1.0 to 3.0); vertical axis: Simulation number (1 to 20)]
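A picture of this kind can be reproduced in text form. The sketch below is only an illustration under assumed settings that are not taken from the slide: normal data with true mean 2 and known standard deviation 1, samples of size 25, and a 95% z-interval. It lists 20 simulated intervals, flags those that miss the true mean, and then estimates coverage from many more repetitions.

```python
import random
from statistics import NormalDist


def simulate_intervals(n_sims=20, n=25, mu=2.0, sigma=1.0, level=0.95, seed=1):
    """Return (lower, upper, covers_mu) for each simulated z-interval.

    All default values are illustrative assumptions; sigma is treated as known.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    intervals = []
    for _ in range(n_sims):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        intervals.append((xbar - half_width, xbar + half_width,
                          abs(xbar - mu) <= half_width))
    return intervals


for i, (lower, upper, covers) in enumerate(simulate_intervals(), start=1):
    note = "" if covers else "  <- misses the true mean"
    print(f"{i:2d}  [{lower:.2f}, {upper:.2f}]{note}")

# Long-run coverage: the property the definition actually describes.
many = simulate_intervals(n_sims=10_000, seed=2)
coverage = sum(covers for _, _, covers in many) / len(many)
print(f"coverage over {len(many)} intervals: {coverage:.3f}")
```

The point of the exercise is that the interval, not the parameter, is what varies from sample to sample.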

We should emphasize that p-values are random variables

Start by saying the p-value is simply a transformation of the test statistic.

If the audience has enough mathematical sophistication, give a formula:

$$p = 1 - F(t_{\mathrm{obs}})$$

where $F(\cdot)$ is the CDF of $T$ under $H_0$.

Show (or state) that this results in $p \sim \mathrm{Unif}(0,1)$ under $H_0$.

Mention that a good $T$ will tend to be larger under $H_1$, so $p$ will be smaller.

THEN give Moore’s statement, as one justification for this definition.
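For an audience that wants the "show" rather than the "state", the uniformity claim follows in a few lines. The derivation below is a sketch that assumes $T$ has a continuous, strictly increasing CDF $F$ under $H_0$, so that $F(T)$ is itself uniform on $(0,1)$; discrete statistics do not satisfy this assumption.

```latex
% Assumption: F is continuous and strictly increasing, so F(T) ~ Unif(0,1) under H_0.
\begin{align*}
P(p \le u \mid H_0)
  &= P\bigl(1 - F(T) \le u \mid H_0\bigr) \\
  &= P\bigl(F(T) \ge 1 - u \mid H_0\bigr) \\
  &= 1 - (1 - u) = u, \qquad 0 \le u \le 1 .
\end{align*}
```

Since $P(p \le u \mid H_0) = u$ is the CDF of the uniform distribution, $p \sim \mathrm{Unif}(0,1)$ under $H_0$.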

Show pictures!

P-values are random variables, so it is natural to study their distribution by simulation.

[Figure: two density histograms, titled "10000 p-values under H0" and "10000 p-values under H1"; horizontal axis: p-values (0.0 to 1.0); vertical axis: Density (0.0 to 2.5)]

Histograms are easily understood.
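A simulation along these lines might look like the sketch below. The test used for the slide is not stated, so this assumes a one-sided z-test of $H_0: \mu = 0$ with known variance, samples of size 10, and $\mu = 0.5$ as the alternative; all of these, and the function names, are illustrative choices. The histograms are printed as text bars.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_p_values(mu, n_sims=10_000, n=10, sigma=1.0, seed=2007):
    """One-sided z-test p-values for H0: mu = 0 when the data are N(mu, sigma^2)."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        z = xbar / (sigma / n ** 0.5)
        p_values.append(1.0 - STD_NORMAL.cdf(z))  # p = 1 - F(t_obs)
    return p_values


def density_histogram(p_values, bins=10):
    """Bar heights scaled so the total area is 1, as in a density histogram."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    return [c * bins / len(p_values) for c in counts]


for label, mu in (("under H0 (mu = 0)", 0.0), ("under H1 (mu = 0.5)", 0.5)):
    print(f"10000 p-values {label}")
    for i, height in enumerate(density_histogram(simulate_p_values(mu))):
        print(f"  {i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * round(5 * height)} {height:.2f}")
```

Under the null the bars should be roughly level at height 1; under the alternative the mass piles up near zero.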

Examples

One-sample t-test

Data: $X_1, \ldots, X_4 \sim N(\mu, \sigma^2)$ i.i.d.

Hypotheses: $H_0: \mu = 0$ versus $H_1: \mu > 0$

Test statistic: $T = \dfrac{\bar{X}}{s/\sqrt{4}} \sim t(3)$ under $H_0$

[Figure: three density histograms of simulated p-values, for µ = 0, µ = 0.5 and µ = 1; horizontal axis: p-values (0.0 to 1.0); vertical axis: Density (0 to 8)]
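The three panels can be approximated with the standard library alone, because the $t(3)$ distribution has a closed-form CDF. The sketch below assumes $\sigma = 1$ and 10000 replications per value of $\mu$ (neither is stated on the slide), and reports the share of p-values at or below 0.05 instead of drawing histograms.

```python
import math
import random


def t3_cdf(t):
    """CDF of Student's t with 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi


def t_test_p_value(sample):
    """One-sided p-value for H0: mu = 0 versus H1: mu > 0; expects 4 observations."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    t_obs = xbar / (s / math.sqrt(n))
    return 1.0 - t3_cdf(t_obs)


def rejection_rate(mu, alpha=0.05, n_sims=10_000, seed=4):
    """Share of simulated p-values at or below alpha (sigma = 1 is an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(mu, 1.0) for _ in range(4)]
        hits += t_test_p_value(sample) <= alpha
    return hits / n_sims


for mu in (0.0, 0.5, 1.0):
    print(f"mu = {mu}: share of p-values <= 0.05 is about {rejection_rate(mu):.3f}")
```

With $\mu = 0$ the share should sit near 0.05, and it should grow as $\mu$ moves into the alternative.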

Composite null hypotheses

When $H_0$ is composite, it may not uniquely determine the distribution.

Hypotheses: $H_0: \mu \le 0$ versus $H_1: \mu > 0$

[Figure: three density histograms of simulated p-values, for µ = −0.5, µ = 0 and µ = 0.5; horizontal axis: p-values (0.0 to 1.0); vertical axis: Density (0 to 4)]
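The effect of a composite null can be made exact in a simplified setting. The sketch below swaps the t-test for a one-sided z-test with known $\sigma = 1$ and $n = 4$ (an assumption chosen so that a closed form exists), where $P(p \le \alpha) = 1 - \Phi(z_\alpha - \sqrt{n}\,\mu/\sigma)$ when the true mean is $\mu$.

```python
from statistics import NormalDist

PHI = NormalDist()


def prob_p_below(alpha, mu, n=4, sigma=1.0):
    """P(p <= alpha) for the one-sided z-test of H0: mu <= 0 when the true mean is mu.

    Known sigma and the z-test itself are simplifying assumptions.
    """
    z_alpha = PHI.inv_cdf(1.0 - alpha)
    return 1.0 - PHI.cdf(z_alpha - mu * n ** 0.5 / sigma)


for mu in (-0.5, 0.0, 0.5):
    print(f"mu = {mu:+.1f}: P(p <= 0.05) = {prob_p_below(0.05, mu):.4f}")
```

Inside the null but away from the boundary ($\mu = -0.5$) small p-values are rarer than the nominal level, at the boundary ($\mu = 0$) they occur at exactly the nominal rate, and under the alternative they are more common.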

Violations of assumptions

If our assumptions are violated, the null distribution of $p$ may be distorted, but larger samples often improve the approximations.

Example: Assume data are $N(\mu, \sigma^2)$, but they are really Exponential(1).

1 One-sample t-test, $H_0: \mu = 1$ versus $H_1: \mu \ne 1$

2 Two-sample t-test, $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \ne \mu_2$

Note that $H_0$ is true in both cases. Let’s look at the null distributions.

[Figure: four density histograms of null p-values, titled "One sample t-test with n=2", "One sample t-test with n=10", "Two sample t-test with n=2" and "Two sample t-test with n=10"; horizontal axis: p-values (0.0 to 1.0); vertical axis: Density (0.0 to 2.0)]
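The one-sample case can be explored with a short simulation. The sketch below draws Exponential(1) samples, so $H_0: \mu = 1$ is true, and applies the usual two-sided t-test. To stay within the standard library it computes the t tail area by numerical integration after the substitution $t = \sqrt{\nu}\tan\theta$; that helper, the replication count and the seed are assumptions of this sketch, not the method used for the slides.

```python
import math
import random


def two_sided_t_p_value(t_obs, df, steps=100):
    """2 * P(T >= |t_obs|) for Student's t with df degrees of freedom.

    With t = sqrt(df) * tan(theta) the density becomes a constant times
    cos(theta) ** (df - 1), which is integrated here by Simpson's rule.
    """
    theta = math.atan(abs(t_obs) / math.sqrt(df))
    h = theta / steps  # steps must be even for Simpson's rule
    total = 1.0 + math.cos(theta) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    integral = total * h / 3.0
    const = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    return max(0.0, 1.0 - 2.0 * const * integral)


def one_sample_p_value(sample, mu0):
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return two_sided_t_p_value((xbar - mu0) / (s / math.sqrt(n)), n - 1)


def null_rejection_rate(n, alpha=0.05, n_sims=2_000, seed=21):
    """Share of p-values <= alpha when H0: mu = 1 holds but data are Exponential(1)."""
    rng = random.Random(seed)
    hits = sum(
        one_sample_p_value([rng.expovariate(1.0) for _ in range(n)], 1.0) <= alpha
        for _ in range(n_sims)
    )
    return hits / n_sims


for n in (2, 10):
    print(f"n = {n}: share of null p-values <= 0.05 is about {null_rejection_rate(n):.3f}")
```

Comparing the printed shares with the nominal 0.05 gives a one-number summary of how far each null distribution is from uniform near zero; a histogram of the p-values shows the full picture.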

Discrete data

With discrete data, p-values inherit a discrete distribution.

You won’t see Unif(0,1) under the null.

This makes display of simulated p-values harder, but the empirical CDF is not too bad.

Test for independence in a 2 × 2 table

Example from Tamhane and Dunlop (2000), Statistics and Data Analysis

                    Success   Failure   Total
Prednisone               14         7      21
Prednisone + VCR         38         4      42
Total                    52        11      63

Pearson’s chi-square p-value: 0.04608

Fisher’s exact p-value: 0.03232
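Both p-values can be recomputed from the four cell counts with the standard library. In the sketch below the function names are illustrative. The quoted Pearson value appears to correspond to the statistic with Yates' continuity correction (which R's `chisq.test` applies to 2 × 2 tables by default), so the uncorrected value is printed alongside for comparison. Fisher's p-value is taken as two-sided, summing the probabilities of all tables no more likely than the observed one.

```python
import math


def pearson_p_value(a, b, c, d, yates=True):
    """Chi-square test of independence (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # continuity correction
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df


def fisher_p_value(a, b, c, d):
    """Two-sided Fisher exact p-value with all margins held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(k):  # hypergeometric probability of k in the top-left cell
        return math.comb(row1, k) * math.comb(row2, col1 - k) / denom

    p_obs = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    return sum(prob(k) for k in support if prob(k) <= p_obs * (1 + 1e-7))


print(f"Pearson, continuity-corrected: {pearson_p_value(14, 7, 38, 4):.5f}")
print(f"Pearson, uncorrected:          {pearson_p_value(14, 7, 38, 4, yates=False):.5f}")
print(f"Fisher exact, two-sided:       {fisher_p_value(14, 7, 38, 4):.5f}")
```

The small relative tolerance in the Fisher sum guards against floating-point ties between equally likely tables.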

What are the null distributions like?

Null tables with fixed margins.

[Figure: for each of Pearson's test and Fisher's test, a density histogram of null p-values (horizontal axis: p-values, 0.0 to 1.0; vertical axis: Density) and an empirical CDF plot (horizontal axis: p-values, 0.0 to 1.0; vertical axis: Proportion)]
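With all margins fixed, the null distribution of the p-value does not need simulation: there are only twelve possible tables, and each has a hypergeometric probability. The sketch below enumerates them for Fisher's test using the margins 21, 42 and 52 from the example table; variable and function names are illustrative.

```python
import math

ROW1, ROW2, COL1 = 21, 42, 52  # row totals and the "Success" column total


def null_prob(k):
    """Hypergeometric probability of k successes in the first row, margins fixed."""
    return (math.comb(ROW1, k) * math.comb(ROW2, COL1 - k)
            / math.comb(ROW1 + ROW2, COL1))


support = range(max(0, COL1 - ROW2), min(ROW1, COL1) + 1)
probs = {k: null_prob(k) for k in support}


def fisher_p(k):
    """Two-sided Fisher p-value of the table with k successes in the first row."""
    return sum(q for q in probs.values() if q <= probs[k] * (1 + 1e-7))


# Null distribution of the p-value: each attainable value with its probability.
p_dist = sorted((fisher_p(k), probs[k]) for k in support)


def ecdf(x):
    """P(p <= x) under the null with fixed margins."""
    return sum(q for p, q in p_dist if p <= x)


for p, q in p_dist:
    print(f"p = {p:.5f}   probability {q:.5f}   P(p <= this value) = {ecdf(p):.5f}")
print(f"P(p <= 0.05) = {ecdf(0.05):.5f}")
```

The step heights of this CDF are what an ECDF plot of simulated p-values estimates, and the last line shows how far the attainable size can fall below a nominal 0.05.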

Null tables with independent rows, P(success) = 52/63

[Figure: for each of Pearson's test and Fisher's test, a density histogram of null p-values (horizontal axis: p-values, 0.0 to 1.0; vertical axis: Density) and an empirical CDF plot (horizontal axis: p-values, 0.0 to 1.0; vertical axis: Proportion)]

Other examples

Explore robustness in other situations where the assumptions are violated. Look for the effect of violations on the power of the test.

Study Welch’s correction for unequal variances in a two-sample t-test. What happens when the variances are equal? What happens if we do not use it when we should?

Show Monte Carlo p-values when the null distribution is only available by simulation. Explore bootstrap tests.

Explore other asymptotic approximations by studying the distributions of nominal p-values.
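As one illustration of a Monte Carlo p-value, the sketch below runs a permutation test on two small samples. The data are made up, and the choice of statistic (difference in means, with large values counted as extreme) is an assumption of this example. The p-value is estimated as $(1 + \#\{T^* \ge t_{\mathrm{obs}}\})/(B + 1)$, which counts the observed statistic as one draw from the null distribution.

```python
import random


def monte_carlo_p_value(x, y, n_resamples=999, seed=0):
    """Permutation p-value for H0: x and y come from the same distribution."""
    rng = random.Random(seed)

    def stat(a, b):
        return sum(a) / len(a) - sum(b) / len(b)

    t_obs = stat(x, y)
    pooled = list(x) + list(y)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # relabel the observations at random
        if stat(pooled[:len(x)], pooled[len(x):]) >= t_obs:
            at_least_as_extreme += 1
    # The observed statistic counts as one draw from the null distribution.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


x = [5.1, 6.0, 5.8, 6.4, 5.9]  # made-up data for illustration
y = [4.8, 5.2, 5.0, 5.5, 4.9]
print(monte_carlo_p_value(x, y))
```

Because the result depends on the random relabellings, a Monte Carlo p-value is itself random even for fixed data; changing `seed` shows this directly.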

Still more examples

In multiple testing, illustrate the distribution of the smallest of n p-values, and the distribution of Bonferroni-corrected p-values.

Storey and Tibshirani (2003) used histograms of p-values in a collection of genomewide tests in order to illustrate false discovery rate calculations.

[Figure: histogram titled "Density of observed p-values"; horizontal axis 0.0 to 1.0; vertical axis 0 to 4]
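The first suggestion can be sketched as follows, assuming n = 10 independent tests with every null hypothesis true, so that each p-value is Unif(0,1); independence and the specific numbers are assumptions of the example. The smallest p-value then satisfies $P(\min p \le x) = 1 - (1 - x)^n$, and the Bonferroni correction replaces $p$ by $\min(1, np)$.

```python
import random


def smallest_p_values(n_tests, n_sims=20_000, seed=28):
    """Smallest of n_tests independent null p-values, repeated n_sims times."""
    rng = random.Random(seed)
    return [min(rng.random() for _ in range(n_tests)) for _ in range(n_sims)]


def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value."""
    return min(1.0, n_tests * p)


n_tests, alpha = 10, 0.05
mins = smallest_p_values(n_tests)
raw_rate = sum(p <= alpha for p in mins) / len(mins)
adj_rate = sum(bonferroni(p, n_tests) <= alpha for p in mins) / len(mins)
theory = 1 - (1 - alpha) ** n_tests

print(f"P(smallest p <= {alpha}): simulated {raw_rate:.3f}, theory {theory:.3f}")
print(f"P(Bonferroni-corrected smallest p <= {alpha}): simulated {adj_rate:.3f}")
```

A histogram of `mins` shows the pile-up near zero that makes the uncorrected smallest p-value misleading; the corrected values bring the rate back to about the nominal level.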

Conclusion

Many students end up with fallacious interpretations of p-values, e.g. $P(H_0 \mid \text{data})$.

We should look at histograms (or ECDF plots) of p-values from simulations.

P-values are random variables!
