Introduction to Biostatistics and Epidemiology Vashini Pillay vash.pillay@uct.ac.za

Post on 24-Jul-2020

Page 1: Introduction to Biostatistics and Epidemiology...Introduction to Biostatistics and Epidemiology Vashini Pillay vash.pillay@uct.ac.za Philosophical background • Basic premise: there

Introduction to Biostatistics and Epidemiology

Vashini Pillay

vash.pillay@uct.ac.za

Philosophical background

• Basic premise: there is an external, objective “truth” that applies to the whole population

• We will never know the Truth

• We can estimate the Truth by testing a sample of the population

• Make some inferences about the whole population

• Question:

– How well does this estimate represent the Truth?

3 Questions

1) Is the sample data representative of the population (i.e. free of bias)?

– Can’t answer this question with statistical methods

– Need to examine how the data were collected

2) “Is there an association?”

– Look at the point estimate (RR / OR / RRR / ARR / NNT)

3) How likely is it that this result occurred by chance?

– P-values and confidence intervals may help
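The point estimates named in question 2 can all be derived from two risks. The sketch below uses the standard definitions (RR = risk in treated / risk in control, ARR = risk in control minus risk in treated, RRR = ARR / risk in control, NNT = 1 / ARR). The function name and the example risks of 20% and 15% are hypothetical and chosen only for illustration.

```python
def effect_measures(risk_control, risk_treated):
    """Return RR, ARR, RRR and NNT for two risks given as proportions (0 to 1)."""
    arr = risk_control - risk_treated            # absolute risk reduction
    rr = risk_treated / risk_control             # risk ratio (relative risk)
    rrr = arr / risk_control                     # relative risk reduction
    nnt = 1 / arr if arr != 0 else float("inf")  # number needed to treat
    return {"RR": rr, "ARR": arr, "RRR": rrr, "NNT": nnt}


# Hypothetical example: 20% of controls and 15% of treated patients have the outcome.
measures = effect_measures(risk_control=0.20, risk_treated=0.15)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative risks the same data give a relative reduction of a quarter but an absolute reduction of only five percentage points, which is why it helps to report more than one measure.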

1. Bias

• A systematic error in the design, conduct or analysis of the study which results in a mistaken estimate of the exposure-outcome relationship.

• It is NOT due to random variability (i.e. chance).

1. Bias

• Selection Bias:

- systematic errors in the selection of subjects

(the manner in which subjects are selected into the study leads to systematic differences in the distributions of these subjects in the exposure/outcome groups compared to the original source population)

• Information Bias:

- systematic errors in the collection of information from study subjects

(some of the information collected with regard to either the exposure or the outcome is incorrect, resulting in study subjects being misclassified into the incorrect study group)

Anna Grimsrud March 2009

1. Bias

• Types of Selection Bias:

- prevalence bias

- participation bias

- sampling bias

- loss to follow-up (LTFU)

• Types of Information Bias:

- recall bias

- measurement bias

- observer bias

- assessment bias

Anna Grimsrud March 2009

Study Designs

• Observational Studies:

- Case Study

- Case Series

- Cross-sectional

- Cohort

- Case Control

• Experimental Studies:

- Randomized Controlled Trials

Study Designs

• Influence the way in which we sample study population

• Influence the way in which we measure / collect data

• Influence the manner in which we analyse data

2. Is there an Association?

• Risk Ratio

the ratio of the risk of developing the outcome of interest (e.g. disease) in the exposed subjects to the risk of developing the outcome of interest in the non-exposed subjects.

RR > 1: Risk of disease is greater among the exposed than among the non-exposed

RR = 1: Risk of disease is the same among the exposed and the non-exposed

RR < 1: Risk of disease is less among the exposed than among the non-exposed (i.e. protective effect)
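As a minimal sketch of the risk ratio calculation, the function below takes the counts from a 2x2 table. The function names and the example counts (30 of 100 exposed and 10 of 100 non-exposed develop the disease) are hypothetical.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the non-exposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def interpret_ratio(ratio):
    """Apply the RR > 1, RR = 1, RR < 1 reading to a point estimate."""
    if ratio > 1:
        return "greater among the exposed"
    if ratio < 1:
        return "less among the exposed (protective effect)"
    return "the same in both groups"


# Hypothetical cohort: 30/100 exposed and 10/100 non-exposed develop the disease.
rr = risk_ratio(30, 100, 10, 100)
print(f"RR = {rr:.2f}: risk of disease is {interpret_ratio(rr)}")
```

This reading applies to the point estimate only; whether the difference from 1 could be due to chance is a separate question.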

2. Is there an Association?

• Odds Ratio

the ratio of the odds of developing the outcome of interest (e.g. disease) in the exposed subjects to the odds of developing the outcome of interest in the non-exposed subjects.

OR > 1: Odds of disease are greater among the exposed than among the non-exposed

OR = 1: Odds of disease are the same among the exposed and the non-exposed

OR < 1: Odds of disease are less among the exposed than among the non-exposed (i.e. protective effect)
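The odds ratio can be sketched in the same way from a 2x2 table, using odds = cases / non-cases in each group. The function name and the counts below are hypothetical; the risk ratio for the same counts is printed alongside to show that the two measures differ when the outcome is common.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome in the exposed divided by odds in the non-exposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical 2x2 table: exposed 30 cases / 70 non-cases, non-exposed 10 cases / 90 non-cases.
or_estimate = odds_ratio(30, 70, 10, 90)
rr_estimate = (30 / (30 + 70)) / (10 / (10 + 90))
print(f"OR = {or_estimate:.2f}, RR = {rr_estimate:.2f}")
```

With these illustrative counts the OR lies further from 1 than the RR, so the two should not be read interchangeably unless the outcome is rare.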

2. Is the Association real?

– To “detect an association” that isn’t real = type 1 error (False Positive in Diagnostic Testing)

– To “miss an association” that is actually there = type 2 error (False Negative in Diagnostic Testing)

• The risk of each type of error should be reviewed and quantified in every analysis
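One way to quantify the two error types is by simulation. The sketch below is an illustration under stated assumptions, not a method taken from the slides: it compares two groups of 50 with a two-sided z-test, assumes normally distributed data with a known standard deviation of 1, and uses a 5% threshold. All names, the group size, the effect size of 0.5 and the random seed are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_statistic(sample_a, sample_b, sigma=1.0):
    """z statistic for a difference in means, assuming a known common sigma."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    return diff / (sigma * math.sqrt(1 / n_a + 1 / n_b))


def rejection_rate(true_diff, n=50, trials=2000, alpha=0.05, seed=1):
    """Fraction of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(z_statistic(group_a, group_b)) < alpha:
            rejections += 1
    return rejections / trials


# No real difference: every rejection is a type 1 error (false positive).
type1 = rejection_rate(true_diff=0.0)
# Real difference of 0.5: every non-rejection is a type 2 error (false negative).
power = rejection_rate(true_diff=0.5)
print(f"type 1 error rate: {type1:.3f}")
print(f"type 2 error rate: {1 - power:.3f} (power {power:.3f})")
```

Under these assumptions the type 1 error rate should land close to the chosen 5% threshold, while the type 2 error rate depends on the sample size and the size of the real effect.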

3. How likely is it that this result occurred by chance?

• P-values deal with the probability of obtaining an estimate at least this extreme if there were truly no association

• Confidence Intervals also deal with the precision of the estimate

Null Hypothesis

• The Null Hypothesis traditionally states that there is no difference in association / relationship between 2 measured phenomena (default / reference point)

• The Alternative Hypothesis states that there is a difference in association / relationship between 2 measured phenomena.

P-Value

• “The probability of obtaining a result at least as extreme as this, assuming the Null Hypothesis is true”

• A measure of how compatible the result is with the Null Hypothesis (i.e. how often chance alone would produce a result at least this extreme if there were truly no association).

• The smaller the p-value, the stronger the evidence against the Null (i.e. the observed association would be very unlikely to occur by chance alone if the Null were true)

P-Value

• P-value = 0.5 means that, if the Null Hypothesis were true, a result at least this extreme would occur by chance 1 time in 2.

• P-value = 0.05 means that, if the Null Hypothesis were true, a result at least this extreme would occur by chance 1 time in 20.

• P-value = 0.01 means that, if the Null Hypothesis were true, a result at least this extreme would occur by chance 1 time in 100.

• P-value = 0.001 means that, if the Null Hypothesis were true, a result at least this extreme would occur by chance 1 time in 1000.
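The link between a test statistic and these p-values can be sketched for the simplest case, a two-sided test on a standard normal (z) statistic. This is an assumption made for illustration; the slides do not specify a test. The function name and the z values are hypothetical, picked because they correspond approximately to the four p-values listed.

```python
import math


def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z, in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))


# z values that correspond approximately to p = 0.5, 0.05, 0.01 and 0.001.
for z in (0.674, 1.96, 2.576, 3.291):
    print(f"z = {z:5.3f} -> p = {two_sided_p_value(z):.4f}")
```

The calculation is always made under the Null Hypothesis, so the result describes how surprising the data are if there is no association. It is not the probability that the Null Hypothesis is true.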

P-Value

• Traditionally, a p-value of less than 0.05 leads to rejection of the Null Hypothesis at the 5% significance level, suggesting statistical significance.

BUT… in terms of clinical significance:

• Is a p-value of 0.049 very different from that of 0.05?

• Is a p-value of 0.051 very different from that of 0.05?

Problems with P-values

• Statisticians hate them (for many complex reasons)

– Major abuse of the p-value: labelling a variable S or NS (significant or not significant)

- based on a single threshold value

- without looking at the magnitude of the effect

- without looking at the clinical significance of the effect

– E.g. a cancer etiology study shows:

• Suggestive evidence of an enormous increase in risk with chemical A

– Risk ratio 13.4, p=0.051

• Strong evidence of a small increase in risk with chemical B

– Risk ratio 1.10, p=0.001

– Chemical A: 13 times increased risk

– Chemical B: 10% increased risk

Never mind the p-value, which chemical are you more afraid of?

Common misconceptions of P-Values

• P = 0.05 does not mean there is only a 5% chance that the null hypothesis is true.

• P = 0.05 does not mean there is a 5% chance of a Type I error (i.e. false positive).

• P = 0.05 does not mean there is a 95% chance that the results would replicate if the study were repeated.

• P > 0.05 does not mean there is no difference between groups.

• P < 0.05 does not mean you have proved your experimental hypothesis.

Goodman S.A. Ann Intern Med. 1999;130:995-1004

Confidence interval

• Emphasis on precision of the estimate

- provides a range of values around the point estimate that is likely to cover the true value; the narrower the range, the more precise the estimate.

• Derived from the same underlying parameters as the p-value (variance and sample size)

• Provides us with more information than a p-value

Normal distribution

• For a normally shaped distribution, 1 standard deviation on either side of the mean contains about 68% of the estimates

• 2 standard deviations on either side of the mean contain about 95.4% of the estimates

• 1.96 standard deviations on either side of the mean contain 95% of the estimates
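The coverage of a normal distribution within k standard deviations of the mean can be checked with the error function, since the proportion within ±k is erf(k / √2). A minimal sketch follows; the function name is hypothetical.

```python
import math


def coverage_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2):
    print(f"within {k} SD: {coverage_within(k) * 100:.1f}%")
```

This gives roughly 68.3% for 1 standard deviation, 95.0% for 1.96 and 95.4% for 2, which is why 1.96 is the multiplier used for a 95% confidence interval.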

Sampling distribution showing effect of sampling error (SE).

Sheldon T A Evid Based Nurs 2000;3:36-39

©2000 by BMJ Publishing Group Ltd and RCN Publishing Company Ltd

Calculate 95% confidence interval

• Calculate sample statistic (point estimate)

• Calculate “standard error” (SE) of the statistic

– “Standard error” is similar to standard deviation (i.e. it is the standard deviation of the estimate across repeated samples, not of the individual observations)

– Measure of the “spread” of the estimate

– Affected by sample size

• Calculate 1.96 x SE

• Upper limit: Point estimate + 1.96 x SE

• Lower limit: Point estimate - 1.96 x SE
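The steps above can be sketched for the simplest case, the mean of a sample, where SE = standard deviation / √n. The function name and the example values are hypothetical. The 1.96 multiplier assumes a large sample (small samples would normally use a t-distribution multiplier instead), and ratio measures such as RR and OR are usually handled on the log scale rather than with this additive formula.

```python
import math
import statistics


def mean_ci_95(values):
    """Return (lower limit, point estimate, upper limit) for the mean of a sample."""
    n = len(values)
    point_estimate = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = 1.96 * standard_error
    return point_estimate - margin, point_estimate, point_estimate + margin


# Hypothetical measurements, for illustration only.
sample = [4.1, 5.0, 5.6, 4.8, 5.3, 6.0, 4.4, 5.2]
lower, estimate, upper = mean_ci_95(sample)
print(f"mean {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```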

Sample size

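• Larger sample size – smaller standard error – narrower confidence intervals

• Greater precision

• For a ratio measure (RR or OR), if the range between the lower limit and the upper limit includes / crosses the value 1, then the result is not statistically significant (at the 5% level). For a difference between groups, the corresponding value is 0.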
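The effect of sample size on the width of a 95% confidence interval can be sketched for a mean, where SE = standard deviation / √n. The function name, the standard deviation of 10 and the sample sizes are hypothetical.

```python
import math


def ci_width_95(standard_deviation, n):
    """Full width of a 95% confidence interval for a mean (upper limit minus lower limit)."""
    standard_error = standard_deviation / math.sqrt(n)
    return 2 * 1.96 * standard_error


# Same spread in the data, increasing sample size.
for n in (25, 100, 400):
    print(f"n = {n:3d}: CI width = {ci_width_95(10, n):.2f}")
```

Because the standard error shrinks with the square root of n, the sample has to be four times larger to halve the width of the interval.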

The larger the sample (n), the smaller the sampling error.

Sheldon T A Evid Based Nurs 2000;3:36-39

©2000 by BMJ Publishing Group Ltd and RCN Publishing Company Ltd

Reporting a confidence interval

• “The truth” existed before you took your sample, it is what it is, and it is unchanging

• Your estimates may be variable (depending on how you took the sample, how many times you repeat the test)

• The truth is fixed, your estimates are flexible

• “We can say with 95% confidence that this interval includes/covers/overlaps the Truth”

• You cannot say: “The truth falls within this interval”

– Implies that your borders are fixed, and the truth is variable, may “fall” here or “fall” there

– The truth is fixed! Your intervals are variable

Advantage of confidence intervals

• Can see size of effect

• Width of confidence interval gives idea of the “stability” of the estimate

– Sample size?

– Effect of some extreme outlier values?

– Random error?

• Narrow CIs are always better!

P Value vs CI

• Hypothetical disease:

• Which exposures are statistically significant? (p values)

• Which has widest CI? (most affected by random error)

• Which are most precise (most trustworthy, less likely to change with repeat testing)?

Exposure   Relative risk   95% CI        P value

A          2.1             0.6 – 7.8     0.24

B          1.6             1.3 – 2.0     0.001

C          4.4             1.5 – 12.4    0.002
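The three questions can be worked through programmatically. The sketch below copies the values from the table; the helper names are hypothetical, and comparing interval widths by the ratio of upper to lower limit is an assumption made here because relative risks are ratio measures.

```python
# Values from the table: (relative risk, CI lower limit, CI upper limit, p-value).
exposures = {
    "A": (2.1, 0.6, 7.8, 0.24),
    "B": (1.6, 1.3, 2.0, 0.001),
    "C": (4.4, 1.5, 12.4, 0.002),
}


def excludes_null(lower, upper, null_value=1.0):
    """True when the interval does not include the 'no association' value."""
    return not (lower <= null_value <= upper)


def relative_width(lower, upper):
    """Width of an interval for a ratio measure, expressed as upper / lower."""
    return upper / lower


significant = [
    name
    for name, (_, lower, upper, p_value) in exposures.items()
    if excludes_null(lower, upper) and p_value < 0.05
]
widest = max(exposures, key=lambda name: relative_width(*exposures[name][1:3]))
most_precise = min(exposures, key=lambda name: relative_width(*exposures[name][1:3]))

print("statistically significant:", significant)
print("widest CI:", widest)
print("most precise:", most_precise)
```

Note that exposure C has the largest point estimate and a small p-value but a wide interval, which is information the p-value alone does not convey.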

Past paper examples: Mar 2008

• You are interested in interventions that could be used in the area to prevent relapse after discharge in children with severe malnutrition treated at the hospital. You find the following article during an evidence-based search:

• “Home based therapy for severe malnutrition with ready-to-use food”

– M J Manary, M J Ndkeha, P Ashorn, K Maleta, A Briend

• Background:

– The standard treatment of severe malnutrition in Malawi often utilises prolonged inpatient care, and after discharge results in high rates of relapse.

• Aims:

– To test the hypothesis that the recovery rate, defined as catch-up growth such that weight-for-height z score >0 (WHZ, based on initial height) for ready-to-use food (RTUF) is greater than two other home based dietary regimens in the treatment of malnutrition.

• Methods:

– HIV negative children >1 year old discharged from the nutrition unit in Blantyre, Malawi were randomised to one of three dietary regimens: RTUF, RTUF supplement, or blended maize/soy flour. RTUF and maize/soy flour provided 730 kJ/kg/day, while the RTUF supplement provided a fixed amount of energy, 2100 kJ/day.

– Children were followed fortnightly. Children completed the study when they reached WHZ >0, relapsed, or died.

– Outcomes were compared using a time-event model.

• Results:

– A total of 282 children were enrolled.

– Children receiving RTUF were more likely to reach WHZ >0 than those receiving RTUF supplement or maize/soy flour (95% v 78%, RR 1.2, 95% CI 1.1 to 1.3).

– Intention to treat analyses also showed that more children receiving RTUF reached graduation weight than those receiving RTUF supplement or maize/soy (86% v 66%, 20% difference, 95% CI 8% to 33%).

– The average weight gain was 5.2 g/kg/day in the RTUF group compared to 3.1 g/kg/day for the maize/soy and RTUF supplement groups. Six months later, 96% of all children who reached graduation weight and returned for follow up, had normal anthropometric indices.

• Abbreviations:

– MUAC, mid-upper arm circumference; NRU, nutritional rehabilitation unit; RTUF, ready-to-use food; WHZ, weight-for-height z score
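The reported point estimates can be reproduced from the percentages in the results. This is a minimal arithmetic check only: the group sizes are not given in the abstract, so the confidence intervals cannot be recomputed here. Variable names are hypothetical.

```python
# Percentages reported in the results (RTUF v comparison regimens).
recovered_rtuf, recovered_other = 95, 78    # reached WHZ > 0
graduated_rtuf, graduated_other = 86, 66    # intention to treat, reached graduation weight

relative_risk = recovered_rtuf / recovered_other
absolute_difference = graduated_rtuf - graduated_other

print(f"RR = {relative_risk:.2f} (reported as 1.2)")
print(f"difference = {absolute_difference} percentage points (reported as 20%)")
```

Here the "risk" is the chance of recovery, so an RR above 1 favours RTUF rather than indicating harm.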

• a) What type of study design was used? (1)

• b) What was the main study outcome? (1)

• c) How would you interpret a weight-for-height z-score of 0 and -1? (2)

• d) How would you interpret the relative risk of 1.2 in the statement “Children receiving RTUF were more likely to reach WHZ >0 than those receiving RTUF supplement or maize/soy flour (95% v 78%, RR 1.2, 95% CI 1.1 to 1.3).”? (1)

• e) How would you interpret the 95% confidence interval of 1.1 to 1.3 in the same sentence? (1)

• f) Was this difference statistically significant? Explain. (1)

• g) What statistical test would you use to decide if the average weight gain in the RTUF group was significantly different to the maize/soy and RTUF supplement groups? (1)

• h) What would your conclusion be about managing children with severe malnutrition at home, based on this study? (3)

Sept 2010:

Sept 2008: ex prem, spastic di
