This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.
Copyright 2006, The Johns Hopkins University and Jonathan M. Samet. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided “AS IS”; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.
Department of Epidemiology
Is Biostatistics Necessary? A Non-Systematic Review of the Evidence
Jonathan M. Samet, MD, MS
PubMed “Hits” on Biostatistics¹ and Epidemiology, 1982² - 2004
[Figure: annual PubMed hits for Biostatistics and Epidemiology, 1982 to 2004; vertical axis on a log scale from 1 to 100,000]
¹ “English language” – only qualifier
² 1982 – Scott Zeger is appointed to the faculty of the School of Hygiene and Public Health
1. Why biostatistics is irrelevant
2. A cause is a cause
3. Ocular data analysis
4. Finding haystacks not needles
5. The seven deadly sins of biostatistics
6. When is biostatistics unavoidable?
7. Tips on the care and feeding of biostatisticians
Al Sommer on Data
• “Don’t pose a question, give the data to your statisticians, and ask them ‘What’s the p value?’” Sommer advises. “If I had done that I would have missed the entire vitamin A mortality story.”
Source: Lancet, Feb 19, 2005
Sommer on Data
• He still loves to steep himself in the data. “I say ‘data talk to me, tell me what you have to say’”. Often, though, the answers come at odd times, Sommer says. “You don’t get the insights you need—either the answer or how you are going to approach a question—while you are actively thinking about it.”
Source: Lancet, Feb 19, 2005
Sommer on Data
• “You have to know your data, you have to smell it, you have to be in it”, he says. “If you’re not living inside the data you are going to miss the most interesting things, because the most interesting things are not going to be the questions you originally proposed, the interesting things are going to be questions you hadn’t thought about.”
Source: Lancet, Feb 19, 2005
“The real purpose of the scientific method is
to make sure Nature hasn’t misled you into
thinking you know something you don’t
actually know.”
(Robert M. Pirsig, 1974)
The 1964 Surgeon General’s Report
• “Statistical methods cannot establish proof of a causal relationship in an association. The causal significance of an association is a matter of judgment which goes beyond any statement of statistical probability”.
Raymond Pearl, 1938: Smoking Shortens Lifespan
Source: Adapted by CTLT from Pearl, Science 1938
Raymond Pearl, 1879-1940
1952 London Fog
This graph, shown in several documents published shortly after the episode, shows the high levels of pollution and the similar pattern in daily mortality.
Adapted by CTLT
John W. Tukey on His Book, Exploratory Data Analysis
• This book is based on an important principle:
• It is important to understand what you CAN DO before you learn to measure how WELL you seem to have DONE it.
• Learning first what you can do will help you to work more easily and effectively.
• This book is about exploratory data analysis, about looking at data to see what it seems to say. It concentrates on simple arithmetic and easy-to-draw pictures. It regards whatever appearances we have recognized as partial descriptions, and tries to look beneath them for new insights. Its concern is with appearance, not with confirmation.
(Tukey, 1977)
Discussion of “Role of Statistics in National Health Policy Decisions”
• The time spent by the medical members of the Surgeon-General’s committee on “analyzing data and interpreting it” encourages me. The analysis and interpretation of data can neither be a domain left to statisticians nor one over which statisticians rule as tyrants. There will always be too few statisticians; they must spread the insight, the techniques, and the attitudes as widely as possible.
(Tukey, 1976)
Small Sample Gems
• They exist! For example:
– DES and vaginal adenocarcinoma
– Uranium mining and lung cancer
– Vinyl chloride and angiosarcoma of the liver
Adenocarcinoma of the Vagina: Association of Maternal Stilbestrol Therapy with Tumor Appearance in Young Women
• Adenocarcinoma of the vagina in young women has been recorded rarely before the report of several cases treated at the Vincent Memorial Hospital between 1966 and 1969. The unusual occurrence of this tumor in eight patients born in New England hospitals between 1946 and 1951 led us to conduct a retrospective investigation in search of factors that might be associated with tumor appearance…. Most significantly, seven of the eight mothers of patients with carcinoma had been treated with diethylstilbestrol started during the first trimester. None in the control group were so treated (p less than 0.00001). Maternal ingestion of stilbestrol during early pregnancy appears to have enhanced the risk of vaginal adenocarcinoma developing years later in the offspring exposed.
Source: Herbst, Ulfelder H, Poskanzer DC.
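As a rough illustration of how so few cases can yield such a small p value, the sketch below computes a one-sided Fisher exact test for a 2x2 table by hand. Only the 7-of-8 exposed case mothers and the zero exposed control mothers come from the slide; the control-group size of 32 is an assumption made for illustration, since the excerpt does not state it, and the function name is hypothetical. This is not necessarily the test the original authors used.

```python
from math import comb


def fisher_exact_upper_tail(a, b, c, d):
    """One-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Rows are cases and controls; columns are exposed and unexposed.
    Returns P(exposed cases >= a) with all table margins held fixed.
    """
    cases = a + b
    controls = c + d
    exposed = a + c
    total = cases + controls
    denom = comb(total, exposed)
    max_exposed_cases = min(cases, exposed)
    tail = sum(
        comb(cases, k) * comb(controls, exposed - k)
        for k in range(a, max_exposed_cases + 1)
    )
    return tail / denom


# 7 of 8 case mothers exposed (from the slide); 0 of 32 controls exposed.
# The 32 is an ASSUMED control-group size, not stated in the excerpt.
p_value = fisher_exact_upper_tail(7, 1, 0, 32)
print(f"one-sided Fisher exact p = {p_value:.2e}")
```

Under that assumed control-group size the result falls well below the reported threshold of 0.00001, which fits the slide's point: with an effect this large, eight cases are enough.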
Uranium Mining and Navajo Men
“The association between uranium mining and lung cancer was statistically significant (p = 1.1 x 10⁻¹¹).”
Source: Samet et al. NEJM 1984
Finding Haystacks not Needles
• For large effects, who needs a p value?
• Principles
– Small numbers, large effect
– Worry
– Bias > Chance > Cause
– Publish? or Perish?
The Seven Deadly Sins of Biostatistics
• P valuing
• Modeling not thinking
• Model as message
• Kitchen sink modeling
• Seduction by sophistication
• Picking the prior
• Intimidating the naive
P-Valuing: A Recent Example
• A Manuscript Reviewed
• Study of race and treatment (N=240)
• Key finding: OR for association of black vs white for being offered treatment = 0.49 (p=0.09)
• Author interpretation: No association
• Samet interpretation: Key finding
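One way to see why p=0.09 does not mean "no association" is to back out an approximate confidence interval from the reported odds ratio and p value. The sketch below assumes the p value came from a two-sided Wald test on the log odds ratio, which the slide does not state; the function name is hypothetical and the interval is only an approximation under that assumption.

```python
from math import exp, log
from statistics import NormalDist


def ci_from_or_and_p(odds_ratio, p_two_sided, level=0.95):
    """Approximate CI for an odds ratio from its two-sided p value.

    ASSUMES a Wald test on log(OR), so that |log(OR)| / SE equals the
    normal quantile corresponding to the reported p value.
    """
    norm = NormalDist()
    z_observed = norm.inv_cdf(1 - p_two_sided / 2)
    standard_error = abs(log(odds_ratio)) / z_observed
    z_critical = norm.inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(odds_ratio) - z_critical * standard_error)
    upper = exp(log(odds_ratio) + z_critical * standard_error)
    return lower, upper


# Values reported on the slide: OR = 0.49, p = 0.09.
lower, upper = ci_from_or_and_p(0.49, 0.09)
print(f"approximate 95% CI for the OR: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval runs from roughly 0.2 to 1.1. It only barely includes 1 and is compatible with the odds of being offered treatment being cut by more than half, so reading the result as "no association" discards the study's main signal.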
Relative Risk of breast cancer according to quintile of adolescent caloric and fat intake in women in the NHS II
a Multivariate model was adjusted for age, time period (two year interval), height (<62, 62–<65, 65–<68, 68+ in.), parity and age at first birth (nulliparous, parity ≤2 and age at first birth <25 years, parity ≤2 and age at first birth 25–<30 years, parity ≤2 and age at first birth 30+ years, parity 3+ and age at first birth <25 years, parity 3+ and age at first birth 25+ years), body mass index at age 18 (<18.5, 18.5–22.4, 22.5–29.9, 30.0+ kg/m²), age at menarche (<12, 12, 13, ≥14 years), family history of breast cancer (yes, no), history of BBD (yes, no), menopausal status (premenopausal, postmenopausal, dubious, unsure), alcohol intake (non-drinkers, <5, 5–<10, 10–<20, 20+ g/d), oral-contraceptive use (never, past ≥4 years, past <4 years, current <8 years, current ≥8 years), weight gain since age 18 (weight loss greater than 5 kg, weight gain or loss 5 kg, weight gain 5–10 kg, weight gain 10–20 kg, weight gain >20 kg). (Frazier et al, 2004)
Kitchen Sink Modeling
Intimidating by Sophistication
• The model was fitted with the Efron method for ties and a robust variance estimator to account for patient-episode level clustering, using Stata 7.0 software (College Station, TX, USA). The proportional-hazards assumption was assessed with log-log survival plots and, formally, with scaled Schoenfeld residuals (Stata).
(Cepeda et al., 2005)
Air pollution signal order of magnitude smaller than confounders
Estimates of model predictors in the GAM model, Pittsburgh (1987-1994)
Adapted by CTLT from: Jonathan M. Samet, M.D., Francesca Dominici, Ph.D., Frank C. Curriero, Ph.D., Ivan Coursac, M.S., and Scott L. Zeger, Ph.D. New England Journal of Medicine
Gibson’s Law
• For every Ph.D. there’s an equal and opposite Ph.D.
Or for every biostatistician, there’s an equal and opposite biostatistician.