common errors in statistics

AWARENESS PROGRAM ON STATISTICS

COMMON ERRORS IN STATISTICS

Dr. MURALIDHAR METTA, M.V.Sc., PhD

Assistant Professor, Department of Animal Genetics and Breeding

NTR College of Veterinary Science, GANNAVARAM

Problems associated with statistical analysis of biological data

• Misuse

• Misinterpretation

• Methodological limitations

"The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify." Darrell Huff (1954)

"Almost every student of probability and statistics simply memorizes the rules. Most ... select their methods blindly, understanding little or nothing of the basis for choosing one method rather than another. This often leads to wildly inappropriate practices, and contributes to the damnation of statistics." Julian Simon and Peter Bruce (1999)

"It is often easier to get a paper published if one uses erroneous statistical analysis than if one uses no statistical analysis at all." Stuart Hurlbert & Celia Lombardi (2003)

"The problem of poor statistical reporting is, in fact, longstanding, widespread, potentially serious, and not well known, despite the fact that most mistakes concern basic statistical concepts and can be easily avoided by following a few guidelines." Tom Lang (2004)

"It's science's dirtiest secret: The "scientific method" of testing hypotheses by statistical analysis stands on a flimsy foundation.... Even when performed correctly statistical tests are widely misunderstood and frequently misinterpreted." Siegfried, T. (2010)

Some errors

Confusing statistical significance with clinical importance

• Small differences between large groups can be statistically significant but clinically meaningless

o Milk yield in large herds

• Large differences between small groups can be clinically important but not statistically significant

o Treatment of cancer
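Both situations can be illustrated with a rough two-sample z-test. The sketch below is only an approximation (with 5 animals per group a t-test would be the appropriate choice), and the milk yields, SD, and group sizes are hypothetical values chosen for illustration, not data from any study.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means (common SD, equal group sizes)."""
    se_diff = sd * sqrt(2.0 / n_per_group)      # standard error of the difference
    z = (mean_a - mean_b) / se_diff
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Hypothetical: 0.1 kg/day difference in milk yield, SD 2 kg, 10,000 cows per group
z_large, p_large = two_sample_z(10.1, 10.0, sd=2.0, n_per_group=10_000)

# Hypothetical: 2 kg/day difference, same SD, only 5 cows per group
z_small, p_small = two_sample_z(12.0, 10.0, sd=2.0, n_per_group=5)

print(f"large groups, tiny difference:  p = {p_large:.4f}")
print(f"small groups, large difference: p = {p_small:.4f}")
```

Under these assumed numbers the tiny difference is "significant" (p below 0.001) while the twenty-fold larger difference is not (p above 0.05). The p-value alone says nothing about whether either difference matters in practice.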

Not defining “normal” or “abnormal” when reporting diagnostic test results

• The importance of statistical differences in diagnostic test results depends on how a “normal” or “abnormal” value is defined

• Diagnostic definition of normal:

o Range of measurements over which the disease is absent and beyond which it is likely to be present

• Statistical definition of normal:

o Based on measurements taken from a disease-free population

o Assumes that the test results are normally distributed

o The normal range is the range of measurements within 2 SD above and below the mean

o The highest and lowest 2.5% of values are considered abnormal

o Not many test results are normally distributed

o E.g.: serum creatinine; Hb
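The effect of the normality assumption on the "mean ± 2 SD" range can be checked on simulated data. The sketch below assumes a right-skewed (log-normal) distribution of test results in a disease-free population. The distribution, its parameters, and the sample size are illustrative assumptions, and the percentile-based range is shown only as one common alternative.

```python
import random
from statistics import mean, stdev, quantiles

random.seed(1)
# Hypothetical right-skewed test results from a disease-free population
values = [random.lognormvariate(0.0, 0.6) for _ in range(5000)]

# "Statistical" normal range: mean +/- 2 SD (assumes a normal distribution)
m, s = mean(values), stdev(values)
gaussian_range = (m - 2 * s, m + 2 * s)

# Percentile-based range: 2.5th to 97.5th percentile (no normality assumption)
cuts = quantiles(values, n=40)  # 39 cut points at 2.5%, 5%, ..., 97.5%
percentile_range = (cuts[0], cuts[-1])

# Share of healthy animals flagged "abnormal" by the mean +/- 2 SD rule
below = sum(v < gaussian_range[0] for v in values) / len(values)
above = sum(v > gaussian_range[1] for v in values) / len(values)

print("mean +/- 2 SD range:", gaussian_range)
print("percentile range:   ", percentile_range)
print(f"flagged low: {below:.1%}, flagged high: {above:.1%}")
```

With skewed data like this, the lower limit of the mean ± 2 SD range falls below zero, so no value can ever be flagged as low, while clearly more than 2.5% of healthy values are flagged as high.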

Assuming Correlation = Causation

• Correlations are sometimes overused

• This is especially true of some studies publicized in the media

• Correlation does not imply causation

o Herd size vs. body weight: suppose animals in large herds tend to have higher body weights

o Is herd size causing the increase in body weight?
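One way such a correlation can arise without any causal link is through a third variable that drives both. The simulation below is a minimal sketch under that assumption: "management" is a hypothetical confounder (not mentioned in the slides) that influences both herd size and body weight, and all values are in arbitrary standardized units.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

random.seed(42)
n = 2000
# Hypothetical confounder: quality of farm management
management = [random.gauss(0, 1) for _ in range(n)]
# Herd size and body weight each depend on management, not on each other
herd_size = [m + random.gauss(0, 0.5) for m in management]
body_weight = [m + random.gauss(0, 0.5) for m in management]

r_raw = pearson(herd_size, body_weight)
r_hm = pearson(herd_size, management)
r_bm = pearson(body_weight, management)

# Partial correlation of herd size and body weight, controlling for management
r_partial = (r_raw - r_hm * r_bm) / sqrt((1 - r_hm ** 2) * (1 - r_bm ** 2))

print(f"raw correlation:     {r_raw:.2f}")
print(f"partial correlation: {r_partial:.2f}")
```

The raw correlation is strong even though herd size has no effect on body weight in the simulation. Once the confounder is controlled for, the correlation is close to zero.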

Misinterpreting Overlapping Confidence Intervals

• Standard deviation – spread of the data

• Standard error – precision of the estimated mean

• Confidence interval (CI) – uncertainty associated with sampling
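A frequent misreading is that two overlapping 95% confidence intervals mean the groups do not differ significantly. The sketch below uses made-up means and standard errors for two independent groups and a normal approximation. It shows that the intervals can overlap while a test of the difference still gives p < 0.05.

```python
from math import sqrt
from statistics import NormalDist

z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% CI

# Hypothetical summary statistics for two independent groups
mean_a, se_a = 10.0, 1.0
mean_b, se_b = 13.0, 1.0

ci_a = (mean_a - z_crit * se_a, mean_a + z_crit * se_a)
ci_b = (mean_b - z_crit * se_b, mean_b + z_crit * se_b)
intervals_overlap = ci_a[1] > ci_b[0]

# Test the difference directly: its SE is sqrt(se_a^2 + se_b^2), not se_a + se_b
se_diff = sqrt(se_a ** 2 + se_b ** 2)
z = (mean_b - mean_a) / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print("CI A:", ci_a, "CI B:", ci_b, "overlap:", intervals_overlap)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The comparison should therefore be based on the confidence interval or test for the difference itself rather than on a visual check of overlap.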

Misuse of standard deviation and standard error

• A common misuse of the standard error is to use it as a descriptive statistic when it is in fact an inferential statistic

• When the data are normally distributed, the standard deviation is the correct descriptive statistic to use as an indicator of variability between observations

• The standard error only reflects this variability for a particular sample size

• SD – describes the observations

• SE – describes the precision of the mean
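The difference between the two statistics shows up when the sample size changes. In the sketch below the body weights are simulated from an assumed normal distribution (mean 50, SD 10, both hypothetical), and the SE is computed as SD / sqrt(n).

```python
import random
from math import sqrt
from statistics import stdev

def sd_and_se(values):
    """Return the sample standard deviation and the standard error of the mean."""
    sd = stdev(values)
    return sd, sd / sqrt(len(values))

random.seed(7)
# Hypothetical body weights: true mean 50, true SD 10
weights = [random.gauss(50.0, 10.0) for _ in range(10_000)]

sd_small, se_small = sd_and_se(weights[:25])  # first 25 animals only
sd_large, se_large = sd_and_se(weights)       # all 10,000 animals

print(f"n = 25:     SD = {sd_small:.2f}, SE = {se_small:.2f}")
print(f"n = 10000:  SD = {sd_large:.2f}, SE = {se_large:.2f}")
```

The SD stays near the variability of the animals whatever the sample size, while the SE shrinks as n grows. Reporting a small SE in place of the SD therefore makes the data look less variable than they are.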

Interpreting studies with non-significant results and low statistical power as negative when they are in fact inconclusive

• The absence of proof is not proof of absence

• Statistical power is the ability to detect a difference of a given size, if such a difference exists

• Several studies that report statistically non-significant findings are underpowered; hence they are inconclusive
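Power can be approximated before a study is run. The function below is a rough sketch based on the normal approximation for a two-sided, two-sample comparison with equal group sizes. It slightly overstates the power of a t-test in small samples, and the effect size and group sizes are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the common SD.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection region
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

# Hypothetical medium-sized effect (half an SD)
power_small = approx_power(0.5, n_per_group=10)
power_large = approx_power(0.5, n_per_group=64)

print(f"n = 10 per group: power = {power_small:.2f}")
print(f"n = 64 per group: power = {power_large:.2f}")
```

With 10 animals per group the chance of detecting a true difference of half an SD is only about 20%, so a non-significant result from such a study says very little. About 64 per group are needed to reach roughly 80% under this approximation.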

Not Distinguishing Between Statistical Significance and Practical Significance

• It is important to remember that, using statistics, we can find a statistically significant difference that has no discernible effect in the "real world"

• Just because a difference exists doesn't make the difference important

o E.g.: pet food package

Not confirming that the data met the assumptions of the statistical tests used to analyze them

• There are hundreds of statistical tests, and several may be appropriate for a given analysis

• However, tests may not give accurate results if their assumptions are not met

• For this reason, both the name of the test and a statement that its assumptions were met should be included in reporting every statistical analysis

• Some common problems are:

o Using parametric tests when the data are not normally distributed (skewed)

o Using tests for independent samples on paired samples, which require tests for paired data
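The second problem can be shown with a small worked example. The sketch below computes the paired t statistic and the independent-samples (Welch) t statistic by hand for the same data. The before/after measurements on five animals are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical measurements on the same five animals before and after treatment
before = [10.0, 12.0, 14.0, 16.0, 18.0]
after = [11.0, 13.5, 15.0, 17.5, 19.0]

# Correct analysis: paired t-test on the within-animal differences
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Incorrect analysis: treating the two columns as independent samples (Welch t)
se_unpaired = sqrt(variance(before) / len(before) + variance(after) / len(after))
t_unpaired = (mean(after) - mean(before)) / se_unpaired

# Two-sided 5% critical value for the paired test (df = 4) is 2.776
print(f"paired t   = {t_paired:.2f}")
print(f"unpaired t = {t_unpaired:.2f}")
```

The paired statistic is about 9.8, far above the critical value of 2.776, because every animal improved by a similar amount. The unpaired statistic is about 0.6, because the large variation between animals swamps the treatment effect when the pairing is ignored.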

Finally……

For error free statistical analysis

• Set forth your objectives and the use you plan to make of your research before you conduct a laboratory experiment, a clinical trial, or survey and before you analyze an existing set of data.

• Define the population to which you will apply the results of your analysis

• List all possible sources of variation. Control them or measure them to avoid their being confounded with relationships among those items that are of primary interest

• Formulate your hypothesis and all of the associated alternatives. List possible experimental findings along with the conclusions you would draw and the actions you would take if this or another result should prove to be the case. Do all of these things before you complete a single data collection form and before you turn on your computer

• Describe in detail how you intend to draw a representative sample from the population

• Know the assumptions that underlie the tests you use. Use those tests that require the minimum of assumptions and are most powerful against the alternatives of interest

• Incorporate in your reports the complete details of how the sample was drawn and describe the population from which it was drawn. If data are missing or the sampling plan was not followed, explain why and list all differences between data that were present in the sample and data that were missing or excluded.

To read…..

• Statistical Mistakes in research… http://influentialpoints.com/Training/statistical_mistakes_in_research_use_and_misuse_of_statistics_in_biology.htm

THANK YOU
