Statistics : the ten main mistakes

Statistics : the ten main mistakes Didier Concordet [email protected] Ecole Nationale Vétérinaire de Toulouse July 2005

Page 1: Statistics : the ten main mistakes

Statistics : the ten main mistakes

Didier Concordet [email protected]

Ecole Nationale Vétérinaire de Toulouse

July 2005

Page 2: Statistics : the ten main mistakes

Statistical mistakes are frequent

• Many surveys of statistical errors in the medical literature, with error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984)

• Reviews of the biomedical literature have consistently found that about half the articles use incorrect statistical methods (Glantz, 1980)

Page 3: Statistics : the ten main mistakes

When do they occur ?

• When designing the experiment
• When collecting data
• When analysing data
• When interpreting results

Page 4: Statistics : the ten main mistakes

Design

• Lack of proper randomisation
    the inference space is not defined
    poor balance of the groups to be compared
    lack of a control group (maybe less frequent now)
    there exist confounding factors

• Lack of power
    the sample size is not large enough to answer the question
    the statistical unit is not well defined

Page 5: Statistics : the ten main mistakes

Inference space definition (M1)

An experiment in 2-year-old beagles showed that the temperature of dogs treated with the antipyretic drug A decreased by 2 °C.

Does this result still hold for
• all 2-year-old beagles
• 3-year-old beagles
• beagles
• dogs
• man

Page 6: Statistics : the ten main mistakes

Poor balance (M2)

Clinical trial
comparison of 2 antipyretics
rectal temperature after treatment

Reference: mean X = 39, N = 100, SD = 1
New TRT: mean X = 37, N = 100, SD = 1

Reference < New TRT (P<0.001)

Page 7: Statistics : the ten main mistakes

Poor balance

Clinical trial
comparison of 2 antipyretics
rectal temperature after treatment

Clinical trial 1
Reference: mean X = 40, N = 90, SD = 1
New TRT: mean X = 42, N = 50, SD = 1
New TRT < Ref (P<0.001)

Clinical trial 2
Reference: mean X = 30, N = 10, SD = 1
New TRT: mean X = 32, N = 50, SD = 1
New TRT < Ref (P<0.001)

Conclusion : Reference > New TRT
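The reversal between the pooled comparison and the two separate trials can be checked by computing sample-size-weighted means. The sketch below uses only the N and mean values quoted on the slides; the helper name `pooled_mean` is illustrative.

```python
# Group sizes and mean temperatures quoted on the slides: (N, mean).
trials = {
    "trial 1": {"Reference": (90, 40.0), "New TRT": (50, 42.0)},
    "trial 2": {"Reference": (10, 30.0), "New TRT": (50, 32.0)},
}

def pooled_mean(arm):
    """Sample-size-weighted mean of one arm across both trials."""
    total_n = sum(t[arm][0] for t in trials.values())
    total = sum(t[arm][0] * t[arm][1] for t in trials.values())
    return total / total_n

# Within each trial the reference arm has the lower mean temperature.
for name, t in trials.items():
    print(name, "Reference:", t["Reference"][1], "New TRT:", t["New TRT"][1])

# After pooling, the ordering is reversed.
print("pooled Reference:", pooled_mean("Reference"))  # 39.0
print("pooled New TRT:", pooled_mean("New TRT"))      # 37.0
```

The reference arm comes mostly from trial 1, where temperatures are high, while the new treatment arm is split evenly between the trials. That imbalance alone is enough to make the pooled comparison point the wrong way.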

Page 8: Statistics : the ten main mistakes

Power (M3)

A clinical study to compare efficacy of two treatments (Ref. and Test)

Expected difference between the treatments = 4

SD 2 for the efficacy variable

A parallel two-group design is planned with 5 dogs in each group

What to think about this study ?

35% power for a type I risk of 5%

Even if the expected difference exists, only 35% of samples of dogs (of size 5) would actually show it !
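Power can be estimated before the study by simulating it many times under an assumed difference. The sketch below is one way to do this with the Python standard library. The difference, SD and group sizes are hypothetical inputs chosen for illustration, not the slide's values, and the critical values of Student's t are entered by hand.

```python
import random

def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulated_power(diff, sd, n, t_crit, n_sim=5000, seed=1):
    """Estimate the power of a two-sided two-sample t-test by simulation.

    diff, sd : assumed true difference between means and common SD
    n        : animals per group
    t_crit   : two-sided critical value of Student's t with 2n - 2 df
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        ref = [rng.gauss(0.0, sd) for _ in range(n)]
        test = [rng.gauss(diff, sd) for _ in range(n)]
        m_ref, v_ref = mean_and_variance(ref)
        m_test, v_test = mean_and_variance(test)
        pooled_var = (v_ref + v_test) / 2          # equal group sizes
        t = (m_test - m_ref) / (pooled_var * 2 / n) ** 0.5
        rejections += abs(t) > t_crit
    return rejections / n_sim

# Hypothetical scenario: true difference of 1 SD, type I risk of 5%.
# 2.306 and 2.024 are the 97.5th percentiles of t with 8 and 38 df.
power_5 = simulated_power(diff=1.0, sd=1.0, n=5, t_crit=2.306)
power_20 = simulated_power(diff=1.0, sd=1.0, n=20, t_crit=2.024)
print("5 per group:", power_5)
print("20 per group:", power_20)
```

Running the function over a range of group sizes gives a rough idea of how many animals are needed before the experiment starts.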

Page 9: Statistics : the ten main mistakes

Power

Efficacy variable on two groups of dogs

Ref: Mean 15.4, SD 2.4, N 5
Test: Mean 20.0, SD 2.6, N 5

Student t-test : P = 0.18

Actually no conclusion

Page 10: Statistics : the ten main mistakes

A real story

A study was performed to investigate the effect of diet on several biochemical compounds (about 20).

To this end, a dog was fed a "normal" diet for 3 months and then the new diet for 3 months.

Every two days, a blood sample was taken and the biochemical compounds were assayed.

At the end of the experiment, 90 measurements were available for each biochemical compound.

There was a significant difference between the effects of the two diets for 10 biochemical compounds (P<0.001).

This result was obtained with a sample size of 90

Page 11: Statistics : the ten main mistakes

Statistical unit (M4)

The statistical unit (an individual) is a statistical object that cannot be divided.

We want to generalise results obtained on a finite collection of units (a sample) to a population of units.

Despite the appearance of "wealth", the sample size was equal to 1, not 90.

At the end of the experiment, the only dog in the experiment was well known, but what about the other dogs of the population ?
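A small simulation can show why 90 measurements on one dog do not behave like a sample of 90 dogs. The sketch below assumes hypothetical values for the population mean, the between-dog SD and the within-dog SD. It checks how often the naive interval "mean ± 2 se", computed from one dog's 90 measurements, contains the population mean.

```python
import random

rng = random.Random(0)

# Hypothetical values, chosen only for illustration.
POP_MEAN = 100.0        # population mean of the compound
BETWEEN_DOG_SD = 10.0   # how much dogs differ from one another
WITHIN_DOG_SD = 2.0     # day-to-day variation within one dog

def naive_interval_covers(n_measures=90):
    """One dog, n_measures samples: does mean ± 2 se contain POP_MEAN?"""
    dog_mean = rng.gauss(POP_MEAN, BETWEEN_DOG_SD)
    xs = [rng.gauss(dog_mean, WITHIN_DOG_SD) for _ in range(n_measures)]
    m = sum(xs) / n_measures
    sd = (sum((x - m) ** 2 for x in xs) / (n_measures - 1)) ** 0.5
    se = sd / n_measures ** 0.5   # treats the 90 values as 90 independent units
    return m - 2 * se <= POP_MEAN <= m + 2 * se

n_experiments = 2000
coverage = sum(naive_interval_covers() for _ in range(n_experiments)) / n_experiments
print(coverage)  # far below the nominal 95%
```

The interval is narrow because the dog is well characterised, but it is centred on that dog's own mean, which says little about other dogs.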

Page 12: Statistics : the ten main mistakes

Experiment

• Missing data not adequately reported

• Extreme values excluded

• Data ignored because they did not support the hypothesis ?

Page 13: Statistics : the ten main mistakes

Analysis

• Failure to check assumptions of the statistical methods (M5)
    homoscedasticity (for a t-test, a linear regression,…)
    using a linear regression without first establishing linearity…correlation

• Ignoring informative "missing" data
    death and its consequences
    data below LOQ

• Choosing the question to get an answer

• Multiple comparisons

Page 14: Statistics : the ten main mistakes

Homoscedasticity (M5)

[Figure: clearance by treatment group (1 and 2), with a panel labelled "What the t-test can see"]

t-test: P-value = 0.56

After log-transformation: P-value = 0.026

Page 15: Statistics : the ten main mistakes

Linearity/Correlation (M5)

Linear regression

Correlation R = -0.002

Linear regression

Correlation R = -0.93
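A correlation close to zero does not rule out a strong relationship. The sketch below uses made-up data, not the data behind the slide's figures, to compare a perfectly curved dependence with a perfect straight line. `pearson_r` is a hand-written helper.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

xs = [x / 10 for x in range(-20, 21)]   # -2.0 to 2.0, symmetric about 0
curved = [x ** 2 for x in xs]           # y is fully determined by x, but not linearly
line = [3 - 2 * x for x in xs]          # decreasing straight line

r_curved = pearson_r(xs, curved)
r_line = pearson_r(xs, line)
print("curved:", round(r_curved, 3))    # essentially zero
print("line:", round(r_line, 3))        # essentially -1
```

Plotting the data before fitting a regression is the simplest way to tell these two situations apart.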

Page 16: Statistics : the ten main mistakes

Linearity/Correlation

Linear regression

Correlation R = 0.84

A linear model with 3 groups

Within group Correlation R = -0.92
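The same effect can be reproduced with a few invented numbers. The sketch below assumes three hypothetical groups whose baselines increase from one group to the next, while y decreases with x inside each group.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Three hypothetical groups (values invented for illustration).
groups = []
for g in range(3):
    xs = [10 * g + i for i in range(5)]       # x shifts upward with the group
    ys = [10 * g + 4 - i for i in range(5)]   # within a group, y falls as x rises
    groups.append((xs, ys))

all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]

r_pooled = pearson_r(all_x, all_y)
r_within = [pearson_r(xs, ys) for xs, ys in groups]
print("ignoring groups:", round(r_pooled, 2))   # strongly positive
print("within each group:", r_within)           # -1.0 in every group
```

Ignoring the group structure reverses the sign of the relationship, which is why the grouping factor belongs in the model.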

Page 17: Statistics : the ten main mistakes

Ignoring data (M6)

[Figure: two plots of temperature (°C, axis from 36.0 to 41.0) against time (day 1 to 6)]

Page 18: Statistics : the ten main mistakes

Ignoring data

[Figure: two plots of temperature (°C, axis from 36.0 to 41.0) against time (day 1 to 6)]

Page 19: Statistics : the ten main mistakes

Choosing the question to get an answer (M7)

Occurs frequently in the presentation of clinical trial results

The question becomes random : it changes with the sample of animals. The question is chosen with its answer in hand… Think of a coin-flip game where you win 1€ whether heads or tails comes up, because you choose the decision rule once you know the result of the flip !

Such an approach increases the number of false discoveries.
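One simple instance of this mistake is choosing the direction of a one-sided test after looking at the data. The sketch below simulates studies with no true effect, using a z-test with known SD as a simplification. All settings are hypothetical.

```python
import random
from statistics import NormalDist

rng = random.Random(42)
N, N_SIM, ALPHA = 20, 5000, 0.05
z_one_sided = NormalDist().inv_cdf(1 - ALPHA)   # critical value of a one-sided z-test

planned = 0        # question fixed in advance: "is the effect positive ?"
chosen_after = 0   # direction of the question chosen after seeing the data
for _ in range(N_SIM):
    # No true effect: the observed differences are pure noise with SD = 1.
    diffs = [rng.gauss(0.0, 1.0) for _ in range(N)]
    z = (sum(diffs) / N) * N ** 0.5
    planned += z > z_one_sided
    chosen_after += abs(z) > z_one_sided

rate_planned = planned / N_SIM
rate_chosen_after = chosen_after / N_SIM
print("question fixed in advance:", rate_planned)
print("question chosen afterwards:", rate_chosen_after)
```

With the question fixed beforehand the false-positive rate stays near the nominal 5%. Picking the direction afterwards roughly doubles it.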

Page 20: Statistics : the ten main mistakes

Multiple comparisons (M8)

Diet   1    2    3    4    5
Mean   700  880  730  790  930
SD     48   50   55   44   60

One wants to compare the ADG (average daily gain) obtained with 5 different diets in pigs

Diets ordered by increasing mean : 1 3 4 2 5

Ten t-tests

A risk of 5% for each comparison : the global risk can be very large
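The size of the global risk can be approximated with a short calculation. The sketch below treats the ten tests as independent, which pairwise t-tests on shared groups are not, so the figure is only indicative. The Bonferroni adjustment shown at the end is one common and conservative remedy; it is not mentioned on the slide.

```python
from math import comb

n_diets = 5
alpha = 0.05

n_tests = comb(n_diets, 2)   # number of pairwise comparisons among 5 diets

# Chance of at least one false positive if the tests were independent.
global_risk = 1 - (1 - alpha) ** n_tests

# Bonferroni: test each comparison at alpha / n_tests instead of alpha.
bonferroni_alpha = alpha / n_tests

print(n_tests)                 # 10
print(round(global_risk, 3))   # 0.401
print(bonferroni_alpha)
```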

Page 21: Statistics : the ten main mistakes

Interpretation/presentation

• Standard error and standard deviation

• P values : non significant effects

• False causality

Page 22: Statistics : the ten main mistakes

Standard error / standard deviation (M9)

The clearance of the drug was equal to 68 ± 5 mL/min

Two possible meanings depending on the meaning of 5

If 5 is the standard error of the mean (se), there is a 95% chance that the population mean clearance belongs to

[68 - 2 × 5 ; 68 + 2 × 5]

If 5 is the standard deviation (SD), 95% of animals have their clearance within

[68 - 2 × 5 ; 68 + 2 × 5]
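The two readings are linked by se = SD / √n, so the same "± 5" implies very different things once the number of animals is known. The sketch below assumes a hypothetical n = 25, which is not given on the slide.

```python
from math import sqrt

mean, spread = 68.0, 5.0   # "68 ± 5 mL/min" from the slide
n = 25                     # hypothetical number of animals

# The interval written on the slide is the same in both readings.
low, high = mean - 2 * spread, mean + 2 * spread
print(low, high)           # 58.0 78.0

# What each reading implies for the other quantity (se = SD / sqrt(n)).
sd_if_5_is_se = spread * sqrt(n)   # SD between animals if 5 is the se
se_if_5_is_sd = spread / sqrt(n)   # se of the mean if 5 is the SD
print(sd_if_5_is_se)       # 25.0
print(se_if_5_is_sd)       # 1.0
```

With 25 animals, reading 5 as the se means individual clearances spread over roughly 68 ± 50. Reading 5 as the SD means the population mean is known to within about ± 2.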

Page 23: Statistics : the ten main mistakes

P values (M10)

The difference between the effect of the drugs A and B is not significant (P = 0.56) therefore drug A can be substituted by drug B.

NO

The only conclusion that can be drawn from such a P value is that you didn't see any difference between the effect of the drugs A and B. That does not mean that such a difference does not exist.

Absence of evidence is not evidence of absence

Page 24: Statistics : the ten main mistakes

P values (M10)

Drug A has a higher efficacy than drug B (P = 0.001)

Drug C has a higher efficacy than drug B (P = 0.04)

Since 0.001 < 0.04, drug A has a higher efficacy than drug C.

NO

The only conclusion that can be drawn from such P values is that you are more sure that A > B and less sure that C > B. This does not presume anything about the amplitude of the differences.

Significant does not mean important
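A P value mixes the size of an effect with the size of the study. The sketch below uses a two-sample z-test with known SD as a simplification, with invented numbers, to compare a small difference in a large study with a large difference in a small study.

```python
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """P value of a two-sample z-test with known common SD (a simplification)."""
    se = sd * (2 / n_per_group) ** 0.5
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical studies: the difference is expressed in SD units.
p_small_effect = two_sided_p(diff=0.1, sd=1.0, n_per_group=5000)
p_large_effect = two_sided_p(diff=1.0, sd=1.0, n_per_group=8)

print("difference 0.1 SD, 5000 per group:", p_small_effect)
print("difference 1.0 SD, 8 per group:", p_large_effect)
```

The tenfold smaller difference gets the far smaller P value, simply because it was measured on many more animals.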

Page 25: Statistics : the ten main mistakes

False causality : lying with statistics

There is a strong positive correlation between the number of firefighters present at a fire and the amount of fire damage.

Thus, the firefighters present at a fire create higher fire damage !

The correlation coefficient is nothing more than a measure of the strength of a linear relationship between 2 variables.

Correlation cannot establish causality.

A strong correlation between X and Y can occur when
• "X" causes "Y"
• "Y" causes "X"
• "Z" causes "X" and "Y" (Z = fire size in the previous example)
• by chance, with small sample sizes, even when X and Y are independent
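The "Z causes X and Y" case can be reproduced by simulation. In the sketch below, fire size drives both the number of firefighters and the damage, and neither influences the other. All coefficients and noise levels are invented for illustration.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after a straight-line fit on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / sum((z - mz) ** 2 for z in zs)
    return [y - my - slope * (z - mz) for y, z in zip(ys, zs)]

rng = random.Random(7)
fire_size = [rng.uniform(1, 10) for _ in range(200)]           # Z
firefighters = [3 * z + rng.gauss(0, 2) for z in fire_size]    # X depends on Z only
damage = [5 * z + rng.gauss(0, 5) for z in fire_size]          # Y depends on Z only

r_raw = pearson_r(firefighters, damage)
# Correlation once the part explained by fire size is removed from both.
r_adjusted = pearson_r(residuals(firefighters, fire_size), residuals(damage, fire_size))
print("raw correlation:", round(r_raw, 2))
print("after adjusting for fire size:", round(r_adjusted, 2))
```

The raw correlation is strong, yet it largely disappears once fire size is accounted for.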

Page 26: Statistics : the ten main mistakes

How to avoid these mistakes ?

• Consult your preferred statistician for help in the design of complicated experiments
• Use basic descriptive statistics first (graphics, summary statistics,…)
• Use common sense
• Consider learning more statistics