common errors in statistics

AWARENESS PROGRAM ON STATISTICS
COMMON ERRORS IN STATISTICS
Dr. MURALIDHAR METTA, M. V. Sc., PhD
Assistant Professor, Department of Animal Genetics and Breeding
NTR College of Veterinary Science, GANNAVARAM

Upload: muralidharmetta

Post on 15-Jan-2016

Problems associated with statistical analysis of biological data

• Misuse

• Misinterpretation

• Methodological limitations

"The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify." Darrell Huff (1954)

"Almost every student of probability and statistics simply memorizes the rules. Most ... select their methods blindly, understanding little or nothing of the basis for choosing one method rather than another. This often leads to wildly inappropriate practices, and contributes to the damnation of statistics." Julian Simon and Peter Bruce (1999)

"It is often easier to get a paper published if one uses erroneous statistical analysis than if one uses no statistical analysis at all." Stuart Hurlbert & Celia Lombardi (2003)

"The problem of poor statistical reporting is, in fact, longstanding, widespread, potentially serious, and not well known, despite the fact that most mistakes concern basic statistical concepts and can be easily avoided by following a few guidelines." Tom Lang (2004)

"It's science's dirtiest secret: The "scientific method" of testing hypotheses by statistical analysis stands on a flimsy foundation.... Even when performed correctly statistical tests are widely misunderstood and frequently misinterpreted." Siegfried, T. (2010)

Some errors

Confusing statistical significance with clinical importance

• Small differences between large groups can be statistically significant but clinically meaningless
  o Eg: Milk yield in large herds
• Large differences between small groups can be clinically important but not statistically significant
  o Eg: Treatment of cancer
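The two bullets above can be illustrated with a rough calculation. This is only a sketch: the milk-yield and treatment numbers are invented, and the helper `two_sided_p` is a hypothetical name that applies a simple z-test with a common, known SD, which is an assumption and not a method taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means.

    Assumes a z-test with the same known SD in both groups (illustrative only).
    """
    se_diff = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = mean_diff / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: 0.05 kg/day difference in milk yield, 100,000 cows per group.
p_large_herds = two_sided_p(mean_diff=0.05, sd=3.0, n_per_group=100_000)

# Hypothetical: a large treatment effect, but only 5 patients per group.
p_small_trial = two_sided_p(mean_diff=3.0, sd=5.0, n_per_group=5)

print(f"tiny difference, huge groups : p = {p_large_herds:.4f}")
print(f"large difference, tiny groups: p = {p_small_trial:.4f}")
```

With these made-up inputs the trivial difference comes out "significant" and the large one does not, which is why the size of the effect should be judged separately from the p-value.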

Not defining “normal” or “abnormal” when reporting diagnostic test results

• The importance of statistical differences in diagnostic test results depends on how a “normal” or “abnormal” value is defined
• Diagnostic definition of normal:
  o The range of measurements over which the disease is absent and beyond which it is likely to be present
• Statistical definition of normal:
  o Based on measurements taken from a disease-free population
  o Assumes that the test results are normally distributed
  o The normal range is the range of measurements within 2 SD above and below the mean
  o The highest and lowest 2.5% of values are classed as abnormal
  o Not many test results are normally distributed
  o Eg: Serum creatinine; Hb
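To see why the "mean ± 2 SD" rule can mislead when results are not normally distributed, here is a small sketch on synthetic, right-skewed values. The data are generated, not real serum creatinine or Hb measurements, and comparing against the 2.5th and 97.5th percentiles is just one possible alternative, shown here as an assumption.

```python
from math import exp
from statistics import NormalDist, mean, stdev, quantiles

# Synthetic right-skewed "test results": evenly spaced quantiles of a
# log-normal distribution (deterministic, all values are positive).
n = 200
values = [exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

# Statistical "normal range" as mean +/- 2 SD.
m, sd = mean(values), stdev(values)
sd_range = (m - 2 * sd, m + 2 * sd)

# Percentile-based range: the central 95% of the observed values.
cuts = quantiles(values, n=40)        # 39 cut points in steps of 2.5%
pct_range = (cuts[0], cuts[-1])       # 2.5th and 97.5th percentiles

print(f"mean +/- 2 SD    : {sd_range[0]:.2f} to {sd_range[1]:.2f}")
print(f"2.5th to 97.5th  : {pct_range[0]:.2f} to {pct_range[1]:.2f}")
```

For this skewed sample the lower limit from mean − 2 SD falls below zero, an impossible value for a measurement that is always positive, while the percentile limits stay inside the observed data.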

Assuming Correlation = Causation

• Correlations are sometimes overused
• This is especially true of some studies publicized in the media
• Correlation does not imply causation
  o Herd size vs body weight: suppose animals in larger herds tend to have higher body weights
  o Does that mean herd size is causing the increase in body weight?
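The herd-size question can be explored with a small simulation. Everything here is invented for illustration: a hypothetical "management level" drives both herd size and body weight, and herd size has no direct effect on weight at all. The slides do not name any such third factor; it is an assumption used to show how a correlation can arise without causation.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(1)   # fixed seed so the sketch is repeatable
rows = []
for level in range(1, 6):            # hypothetical management level, 1 to 5
    for _ in range(400):
        herd_size = 50 * level + rng.gauss(0, 10)          # depends on level only
        body_weight = 350 + 20 * level + rng.gauss(0, 10)  # depends on level only
        rows.append((level, herd_size, body_weight))

# Correlation across all farms, ignoring management level.
overall_r = pearson_r([r[1] for r in rows], [r[2] for r in rows])

# Correlation among farms that share the same management level.
within_r = {
    level: pearson_r([r[1] for r in rows if r[0] == level],
                     [r[2] for r in rows if r[0] == level])
    for level in range(1, 6)
}

print(f"overall r = {overall_r:.2f}")
for level, r in within_r.items():
    print(f"level {level}: r = {r:.2f}")
```

Across all farms the correlation is strong, yet within any single management level it is close to zero, because in this simulation herd size never influenced body weight.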

Misinterpreting Overlapping Confidence Intervals

• Standard deviation: spread of the data
• Standard error: precision of the estimated mean
• Confidence interval (CI): uncertainty in an estimate that arises from sampling
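One common misreading, presumably the one this slide has in mind, is that two overlapping 95% confidence intervals mean the difference between the groups is not significant. The sketch below uses invented means and standard errors and a normal approximation (both assumptions) to show that the two can disagree.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)   # about 1.96


def ci95(mean, se):
    """95% confidence interval for a mean, using the normal approximation."""
    return (mean - Z95 * se, mean + Z95 * se)


# Hypothetical group means and their standard errors.
mean_a, se_a = 10.0, 1.0
mean_b, se_b = 13.0, 1.0

ci_a, ci_b = ci95(mean_a, se_a), ci95(mean_b, se_b)
intervals_overlap = ci_a[1] > ci_b[0] and ci_b[1] > ci_a[0]

# Test the difference directly: its standard error combines both SEs.
se_diff = sqrt(se_a ** 2 + se_b ** 2)
z = (mean_b - mean_a) / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"CI A = ({ci_a[0]:.2f}, {ci_a[1]:.2f}), CI B = ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print(f"overlap: {intervals_overlap}, p for the difference = {p_value:.3f}")
```

Here the intervals overlap, yet the test on the difference gives p below 0.05, so overlap on its own does not settle the question.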

Misuse of standard deviation and standard error

• A common misuse of the standard error is to use it as a descriptive statistic when it is in fact an inferential statistic
• When the data are normally distributed, the standard deviation is the correct descriptive statistic to use as an indicator of variability between observations
• The standard error reflects this variability only for a particular sample size
• SD: describes the observations
• SE: describes the precision of the mean
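The dependence of the standard error on sample size follows from SE = SD / √n. A minimal sketch, using invented milk-yield values and a hypothetical helper `sd_and_se`; the sample is simply repeated to enlarge n while holding the spread of the observations roughly fixed.

```python
from math import sqrt
from statistics import stdev


def sd_and_se(sample):
    """Return (standard deviation, standard error of the mean) for a sample."""
    sd = stdev(sample)
    return sd, sd / sqrt(len(sample))


# Hypothetical daily milk yields (kg) for 8 cows.
base = [18.2, 19.5, 20.1, 21.0, 22.4, 19.8, 20.6, 21.3]

results = {}
for copies in (1, 4, 16):
    sample = base * copies            # same spread, larger n
    results[len(sample)] = sd_and_se(sample)

for n, (sd, se) in results.items():
    print(f"n = {n:3d}  SD = {sd:.2f}  SE = {se:.2f}")
```

The SD stays almost the same as n grows, while the SE keeps shrinking, so reporting the SE in place of the SD makes the data look less variable than they are.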

Interpreting studies with non-significant results and low statistical power as negative when they are in fact inconclusive

• The absence of proof is not proof of absence
• Statistical power is the ability to detect a difference of a given size
• Several studies that report non-statistically significant findings are underpowered, hence they are inconclusive
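Power can be approximated before a study is run. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a known common SD; the function name `approx_power` and the effect size of half an SD are illustrative choices, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (known common SD)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect / (sd * sqrt(2 / n_per_group))   # true difference in SE units
    # Probability that the test statistic lands in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical true difference of half a standard deviation.
power_small = approx_power(effect=0.5, sd=1.0, n_per_group=10)
power_large = approx_power(effect=0.5, sd=1.0, n_per_group=100)

print(f"n = 10 per group : power = {power_small:.2f}")
print(f"n = 100 per group: power = {power_large:.2f}")
```

With 10 animals per group the study would miss this real difference most of the time, so a non-significant result from it says little either way.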

Not Distinguishing Between Statistical Significance and Practical Significance

• It is important to remember that, using statistics, we can find a statistically significant difference that has no discernible effect in the "real world"
• Just because a difference exists doesn't make the difference important
  o Eg: pet food packages

Not confirming that the data met the assumptions of the statistical tests used to analyze them

• There are hundreds of statistical tests, and several may be appropriate for a given analysis
• However, tests may not give accurate results if their assumptions are not met
• For this reason, both the name of the test and a statement that its assumptions were met should be included in reporting every statistical analysis
• Some common problems are:
  o Using parametric tests when the data are not normally distributed (skewed)
  o Using tests for independent samples on paired samples, which require tests for paired data
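The second problem, treating paired data as independent samples, can be sketched with invented before/after body weights for six animals. The t statistics are computed by hand from their textbook formulas (paired t and unequal-variance two-sample t); the data and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical body weights (kg) of the same six animals before and after a
# treatment: animals differ a lot from each other, but each gains a few kg.
before = [310, 355, 402, 448, 495, 540]
after = [314, 361, 407, 455, 500, 546]

# Correct for paired data: analyse the within-animal differences.
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Incorrect for these data: treat the two columns as independent groups.
se_unpaired = sqrt(stdev(before) ** 2 / len(before)
                   + stdev(after) ** 2 / len(after))
t_unpaired = (mean(after) - mean(before)) / se_unpaired

print(f"paired t      = {t_paired:.2f}")
print(f"independent t = {t_unpaired:.2f}")
```

The same mean gain gives a very large paired t but an independent-samples t near zero, because the unpaired analysis buries the consistent gain under the large animal-to-animal variation.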

Finally……

For error-free statistical analysis

• Set forth your objectives and the use you plan to make of your research before you conduct a laboratory experiment, a clinical trial, or survey and before you analyze an existing set of data.

• Define the population to which you will apply the results of your analysis.

• List all possible sources of variation. Control them or measure them to avoid their being confounded with relationships among those items that are of primary interest.

• Formulate your hypothesis and all of the associated alternatives. List possible experimental findings along with the conclusions you would draw and the actions you would take if this or another result should prove to be the case. Do all of these things before you complete a single data collection form and before you turn on your computer.

• Describe in detail how you intend to draw a representative sample from the population.

• Know the assumptions that underlie the tests you use. Use those tests that require the minimum of assumptions and are most powerful against the alternatives of interest.

• Incorporate in your reports the complete details of how the sample was drawn and describe the population from which it was drawn. If data are missing or the sampling plan was not followed, explain why and list all differences between data that were present in the sample and data that were missing or excluded.

THANK YOU