statistical issues. statement of the problem how often are articles published with errors in...

Statistical Issues

Statement of the Problem

How often are articles published with errors in statistical methods?
– So what? Should we believe only articles with perfect methodology?

What is “statistical significance”?
– Why is the definition given a “modest statement”?

Background

Why don’t authors refine their methods until they’ve removed the role of chance? Then we could forget statistics altogether. We’d like to be certain that our therapies work, that we only apply them when necessary, and that they are not harmful.

The last sentence of the first full paragraph on page 550 implies that—even if the studies are well designed and analyzed—1 in every 20 articles reports incorrect results. Can’t we be sure? Isn’t there a better way?
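One way to see where a "1 in 20" figure can come from is to simulate many studies in which the two groups truly do not differ and count how often the comparison is still declared significant. The following is a minimal sketch, not a method from the article: the 0.05 threshold, the group size of 30, the known standard deviation of 1, the two-sided z-test, and the function names are all illustrative assumptions.

```python
import math
import random


def two_sided_z_pvalue(group_a, group_b, sigma=1.0):
    """Two-sided p-value for a difference in means, assuming a known SD (sigma)."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    se = sigma * math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = diff / se
    # P(|Z| >= |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(n_studies=5000, n_per_group=30, alpha=0.05, seed=1):
    """Fraction of simulated no-effect studies that come out 'significant'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_studies):
        # Both groups come from the same population: any "effect" is chance.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if two_sided_z_pvalue(a, b) < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    print(false_positive_rate())
```

Under these assumptions the printed fraction should land near the chosen threshold. Note that the simulation only covers comparisons where no real difference exists; it says nothing about how often published studies fall into that category.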

5 Parts

What are the 5 parts of a “well-designed study”? How do you read a study and decide whether each of these parts has been adequately met?

Intro

In a paper’s Introduction section, what do you look for?
– In the “review of the literature,” what do you look for?
– No study is comprehensive. How do you assess whether the “statement of purpose” of this paper is clear, relevant and potentially useful?
– How do you assess if the paper has a “clear concept about the specific hypothesis to be tested”?

Methods-Participants

In a paper’s Methodology section, what do you look for? Most methods sections begin with a description of the study participants. How do you assess each of the following?
– Was the study population representative of some typical group?
– What forms of bias may be acting in the selection of the study population?
– What forms of bias were introduced by the inclusion/exclusion criteria?
– How do you assess whether these study participants are representative of the population of interest?

Methods-Study Design

Most methods sections include a discussion of the study design. Designs usually include comparison groups. Why? Consider the following:
– What was the control group? (What did this control for? And not control for?)
– What was the positive intervention tested in the study?
– Exactly how are the participants assigned to the study groups? (Randomly?)
– How do we assess possible biases in group membership?

Methods-Measurement

All methods sections include discussion of what was measured and how. Is the description of how and what was measured sufficient?
– Is the measurement process reproducible?
– Who was blind to the experimental groups: Those measuring? Those applying the intervention? The study participants? Exactly when were the data unblinded?
– If evaluations are subject to interpretation, are assessments made by more than one blinded evaluator?

Measurement issues not addressed in Baumgardner

Are multiple units measured within an individual? That is, in dentistry it is possible to have “split-mouth” designs, to measure multiple quadrants, to measure multiple teeth, to measure multiple sites within teeth, to have multiple sections (layers) of tissue. How are these multiple measures handled?

Does an individual receive one and only one intervention or multiple interventions?

Additionally, some studies measure characteristics over time (on multiple occasions). Thus, each individual can act as its own control. However, these multiple time-points must be taken into account in the analyses.
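To illustrate why repeated measurements on the same individual have to be taken into account, the sketch below computes both a paired and an unpaired t statistic on the same before/after scores. The five pairs of scores are made up for illustration only, and the function names are hypothetical; the unpaired version uses the standard pooled-variance formula for equal group sizes.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic on within-person differences (each person is their own control)."""
    diffs = [b - a for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


def unpaired_t(group_a, group_b):
    """Pooled-variance t statistic that treats the two lists as unrelated groups.

    Assumes equal group sizes, which keeps the pooled standard error simple.
    """
    n = len(group_a)
    se = math.sqrt(statistics.variance(group_a) / n + statistics.variance(group_b) / n)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


# Hypothetical scores for five people measured on two occasions.
before = [5.0, 6.0, 7.0, 8.0, 9.0]
after = [4.5, 5.4, 6.6, 7.4, 8.6]

if __name__ == "__main__":
    print("paired t:  ", round(paired_t(before, after), 2))
    print("unpaired t:", round(unpaired_t(before, after), 2))
```

In this made-up data every person changes by a similar small amount, but people differ widely from one another. The paired statistic is large because it looks only at the within-person change; the unpaired statistic is small because that change is swamped by between-person variation. Analyzing paired data as if the groups were independent can therefore miss a consistent effect.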

Table II. Summary of statistical tests by the type of data collected and the research design used

Rows give the scale of measurement; columns give the type of experiment.

| Scale of measurement | Two Tx groups (unpaired) | Two Tx groups (paired, i.e., before and after) | Three or more Tx groups | Three or more Tx groups (multiple measurements in the same individual) |
| Quantitative (and from normally distributed populations) | Unpaired t-test | Paired t-test | Analysis of variance | Repeated-measures analysis of variance |
| Ordinal or nonparametric quantitative data | Mann-Whitney test | Wilcoxon signed rank test | Kruskal-Wallis test | Friedman test |
| Nominal | Chi-squared analysis or Fisher's exact probability test | McNemar's test | Chi-squared | Cochran Q* |

*Not covered in this discussion

Tests are listed with those that require the most information from the data at the top of each column to those that require the least amount of information at the bottom of each column. It is statistically legitimate to use data with more information in a "lower" test (i.e., quantitative data can be used in a parametric, ordinal or nominal test); information and power to detect subtle differences are lost, however. It is inappropriate to use lower data in a higher test (i.e., going up the columns); this adds information to data that were not originally collected.
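The selection logic of Table II, including the rule that data may be moved down the columns but not up, can be written as a small lookup. This is only a sketch of the table as reproduced here; the short keys ("quantitative", "paired", and so on) and the function names are hypothetical labels chosen for the example.

```python
# Scales ordered from most information (top of each column) to least (bottom).
SCALES = ("quantitative", "ordinal", "nominal")

# Hypothetical short keys for the four experiment types in Table II.
TABLE_II = {
    "quantitative": {
        "unpaired": "Unpaired t-test",
        "paired": "Paired t-test",
        "three_or_more": "Analysis of variance",
        "repeated": "Repeated-measures analysis of variance",
    },
    "ordinal": {
        "unpaired": "Mann-Whitney test",
        "paired": "Wilcoxon signed rank test",
        "three_or_more": "Kruskal-Wallis test",
        "repeated": "Friedman test",
    },
    "nominal": {
        "unpaired": "Chi-squared analysis or Fisher's exact probability test",
        "paired": "McNemar's test",
        "three_or_more": "Chi-squared",
        "repeated": "Cochran Q",
    },
}


def suggested_test(scale, design):
    """Test listed in Table II for this scale of measurement and design."""
    return TABLE_II[scale][design]


def permissible_tests(scale, design):
    """Tests at or below the data's own scale, in order of decreasing information.

    Moving down a column is legitimate but loses power; moving up is not allowed,
    so tests above the data's scale are never returned.
    """
    start = SCALES.index(scale)
    return [TABLE_II[s][design] for s in SCALES[start:]]


if __name__ == "__main__":
    print(suggested_test("ordinal", "paired"))
    print(permissible_tests("quantitative", "unpaired"))
```

For example, quantitative data from two unpaired groups returns all three tests in that column, while nominal data returns only the chi-squared or Fisher's exact option.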
