SADC Course in Statistics Basic summaries for epidemiological studies (Session 04)

Upload: katherine-cobb

Post on 28-Mar-2015


Page 1: SADC Course in Statistics Basic summaries for epidemiological studies (Session 04)

SADC Course in Statistics

Basic summaries for epidemiological studies

(Session 04)


Learning Objectives

At the end of this session, you will be able to

• correctly distinguish and use ideas of prevalence and incidence

• explain the concepts of risk in relation to health outcomes, and of what may be “causal” factors

• use the concepts of relative risk and odds ratio in relation to simple epidemiological studies


Attribute Data

An attribute is an ascertainable characteristic either present or absent in an individual, so that the “measurement” on an individual can be represented as either 1 or 0.

Many measures in epidemiology are of this type e.g. a test for HIV seropositivity yields such a 0/1 response. This may still involve expert interpretation & judgment, with possibility of false positives and false negatives.


Point Prevalence

Prevalence concerns the number of instances of an attribute in the population, usually at a point in time, relative to the number at risk, i.e. expressed as a proportion, a percentage, per 1000 or even per million where positives are rare. So point prevalence (as a percentage) is

$$\text{Point prevalence (\%)} = \frac{\text{No. of individuals with + attribute at time point}}{\text{No. of individuals in population at risk at time point}} \times 100$$
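As a minimal sketch of the calculation above, the Python function below scales the proportion to any base (per 100, per 1000, per million). The function name and the survey counts are hypothetical, chosen only for illustration.

```python
def point_prevalence(cases: int, at_risk: int, per: int = 100) -> float:
    """Cases with the attribute at a time point, per `per` individuals at risk."""
    if at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return cases / at_risk * per


# Hypothetical survey: 45 positives among 1,500 individuals at risk.
as_percentage = point_prevalence(45, 1500)
per_thousand = point_prevalence(45, 1500, per=1000)
print(f"{as_percentage:.1f}%  or  {per_thousand:.0f} per 1000")
```

Choosing a larger base for rare conditions only changes the presentation; the underlying proportion is the same.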


Period Prevalence

This refers to the number of cases known to have been prevalent during a period, e.g. a year.

The numerator above would be replaced by the sum of (1) the no. of prevalent cases at the start of the year, and (2) the no. of new cases arising during the period.

The denominator is then usually a mid-year figure for the population at risk.


Prevalence: notes

• Occasionally “prevalence” is used for absolute number of cases/instances – best not to call this “prevalence”!

• Both point and period prevalence are snapshot figures. They are NOT rates.

• Period prevalence sensible for short-duration condition where numbers can rise/fall fast.

• No. “at risk” needs thought e.g. males only for prostate conditions.

• Prevalences can be age-specific.


Incidence

Incidence (always a rate ~ a flow statistic) as a population measure is normally put on a yearly rate basis. As a proportion:

$$\text{Incidence} = \frac{\text{No. of new cases arising in a period of 1 year}}{\text{Mid-year population at risk}}$$

As with prevalence, often put as %, ‰ etc

• Watch out for non-experts confusing or misusing the terms prevalence & incidence!
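To make the distinction concrete, the sketch below computes period prevalence and incidence from the same set of figures. All counts are hypothetical and the helper name `per_base` is an assumption for illustration; the only difference between the two measures is whether cases already prevalent at the start of the year are counted in the numerator.

```python
def per_base(numerator: float, denominator: float, base: int = 1000) -> float:
    """Express numerator/denominator per `base` individuals at risk."""
    if denominator <= 0:
        raise ValueError("population at risk must be positive")
    return numerator / denominator * base


# Hypothetical figures for one year (not from the course material)
cases_at_start = 120        # prevalent cases at the start of the year
new_cases = 80              # new cases arising during the year
mid_year_at_risk = 10_000   # mid-year population at risk

period_prevalence = per_base(cases_at_start + new_cases, mid_year_at_risk)
incidence = per_base(new_cases, mid_year_at_risk)

print(f"Period prevalence: {period_prevalence:.1f} per 1000")
print(f"Incidence:         {incidence:.1f} per 1000 per year")
```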


Relationship of prevalence & incidence

When prevalence P is relatively small and condition is of limited duration (say averaging time T) and population is in a “steady state”, then approximately:-

P = I x T

where I = incidence.

Exercise ~ try to express in words a rough justification for the above expression.
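A numerical check may help before attempting the verbal justification. The sketch below is a deliberately simple steady-state model with hypothetical numbers: a constant population, a constant weekly intake of new cases, and every case lasting exactly the same time. It ignores the fact that people who are currently ill are not at risk, which is reasonable only when P is small.

```python
from collections import deque

population = 100_000        # assumed constant (steady state)
new_cases_per_week = 100    # so I = 100 * 52 / 100_000 = 0.052 per year
duration_weeks = 4          # every case lasts exactly 4 weeks, so T = 4/52 year

# Keep only the last `duration_weeks` weekly intakes: these are the cases still ill.
still_ill = deque(maxlen=duration_weeks)
for _ in range(52):
    still_ill.append(new_cases_per_week)

prevalence = sum(still_ill) / population
incidence = new_cases_per_week * 52 / population
mean_duration_years = duration_weeks / 52

print(f"P     = {prevalence:.4f}")
print(f"I x T = {incidence * mean_duration_years:.4f}")
```

Once the first four weeks have passed, the number of people ill at any moment is simply the intake over the last T units of time, which is the idea behind the approximation.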


Probability, risk or cumulative incidence

Sometimes a study population is relatively small, or a sample can be followed up. Then we can calculate “risk” or cumulative incidence as:

$$\text{Risk} = \frac{\text{No. of new cases arising in one year}}{\text{No. of healthy individuals in population at start of year}}$$

This is then an estimated probability; note the mortality rate of session 14 is an example of this.
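Because each followed-up individual contributes a 0/1 attribute, the estimated risk is just the mean of those values. The short sketch below illustrates this with a hypothetical cohort of ten individuals, all healthy at the start of the year.

```python
# 1 = became a new case during the year, 0 = stayed healthy (hypothetical data)
outcomes = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

new_cases = sum(outcomes)
healthy_at_start = len(outcomes)
risk = new_cases / healthy_at_start

print(f"Estimated one-year risk: {new_cases}/{healthy_at_start} = {risk:.2f}")
```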


Sources of risk: 1

Much of epidemiology concerns “risk factors” that may be “causes” of the disease.

There are logical difficulties in proving causation, & often a complex set of pre-disposing and influencing factors.

In simplest case, consider just one risk factor e.g. cigarette smoking, and reduce the risk factor ~ as well as disease attribute ~ to present/absent.

Discuss what might be more realistic model!


Sources of risk: 2

With one Yes/No attribute and one “present/absent” risk factor a 2x2 table of frequencies could be:

| | Diseased | Not diseased | Total |
|---|---|---|---|
| Risk factor present | a | b | a + b |
| Risk factor absent | c | d | c + d |
| Total | a + c | b + d | a + b + c + d = n |
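In practice the four cell frequencies are obtained by cross-tabulating individual records. The sketch below shows one way to do this in Python, assuming each record is a hypothetical `(risk_factor, diseased)` pair coded 1 for present and 0 for absent; the eight records are invented for illustration.

```python
from collections import Counter

# Hypothetical individual records: (risk factor present?, diseased?)
records = [
    (1, 1), (1, 0), (1, 0), (0, 1),
    (0, 0), (0, 0), (0, 0), (1, 1),
]

counts = Counter(records)
a = counts[(1, 1)]  # risk factor present, diseased
b = counts[(1, 0)]  # risk factor present, not diseased
c = counts[(0, 1)]  # risk factor absent, diseased
d = counts[(0, 0)]  # risk factor absent, not diseased
n = a + b + c + d

print(f"a={a} b={b} c={c} d={d} n={n}")
```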


Cohort study

This involves selecting, & following through a period of time, individuals some with risk factor present, some absent. Outcome observation = no. with disease at endpoint.

In a general population cohort study only n is fixed. If general exposure to the risk factor is low, (a + b) will be small relative to n, which makes the study costly; so where possible (a + b) and (c + d) are often fixed in advance, e.g. as equal sample sizes.


Cohort study relative risk: 1

With observed frequencies a, b, c, d as above, the disease risk (over the study duration) among:

the risk-factor + group is: a/(a + b)

the risk-factor – group is: c/(c + d)

The relative risk is the ratio of these two risks:

$$RR = \frac{a\,(c + d)}{(a + b)\,c}$$


Cohort study relative risk: 2

The relative risk is the ratio of these two risks:

$$RR = \frac{a\,(c + d)}{(a + b)\,c}$$

Often disease rates are relatively low, so a/(a + b) ≈ a/b and c/(c + d) ≈ c/d, and then

$$RR \approx \frac{a\,d}{b\,c}$$

This is described as the “odds ratio” or “approximate relative risk”, with a/b being the odds of getting the disease having the exposure, and c/d the odds not having the exposure.
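The approximation can be checked numerically. The sketch below wraps the four cell counts in a small class (the name `TwoByTwo` and the counts are hypothetical) and compares the relative risk with the odds ratio for a rare disease, where the two should be close.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # risk factor present, diseased
    b: int  # risk factor present, not diseased
    c: int  # risk factor absent, diseased
    d: int  # risk factor absent, not diseased

    def relative_risk(self) -> float:
        # Ratio of a/(a + b) to c/(c + d); undefined if c == 0
        return (self.a / (self.a + self.b)) / (self.c / (self.c + self.d))

    def odds_ratio(self) -> float:
        # a.d / b.c; undefined if b == 0 or c == 0
        return (self.a * self.d) / (self.b * self.c)


# Hypothetical rare disease: risks of 1% and 0.5% in the two groups
rare = TwoByTwo(a=10, b=990, c=5, d=995)
print(f"RR = {rare.relative_risk():.3f}, odds ratio = {rare.odds_ratio():.3f}")
```

With risks this low the odds ratio differs from the relative risk only in the second decimal place; the gap widens as the disease becomes more common.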


Cohort study relative risk: 3

Example ~ population of miners

| Asbestos | Lung cancer + | No lung cancer – | Total |
|---|---|---|---|
| Exposed | 58 | 372 | 430 |
| Not exposed | 27 | 343 | 370 |
| Total | 85 | 715 | 800 |

RR = (58/430)/(27/370) = 1.85

Odds ratio = (58/372)/(27/343) = 1.98

The two measures give similar representations of the extra risk due to occupational asbestos exposure.
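The two figures for the miners can be reproduced directly from the table counts. This is a plain arithmetic check in Python; the variable names follow the a, b, c, d notation of the 2x2 table.

```python
a, b = 58, 372   # exposed to asbestos: lung cancer +, lung cancer -
c, d = 27, 343   # not exposed:         lung cancer +, lung cancer -

risk_exposed = a / (a + b)       # 58/430
risk_unexposed = c / (c + d)     # 27/370
relative_risk = risk_exposed / risk_unexposed
odds_ratio = (a / b) / (c / d)

print(f"RR = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")
# prints: RR = 1.85, odds ratio = 1.98
```

The odds ratio is somewhat larger than the relative risk here because lung cancer is not especially rare in this population (about 13% among the exposed).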


Case-control relative risk

In a case-control study (module I1, sess. 05) numbers of lung cancer positive “cases” and lung cancer negative “controls” would be fixed by design. RR cannot be calculated, but the same odds ratio can, & is used as approximation to relative risk.

Odds ratios are statistically modelled by professional epidemiologists to account for numerous complicating factors.
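One way to see why the odds ratio survives a case-control design is to change the sampling of cases and recompute both measures. The sketch below starts from the miners' counts and assumes, purely hypothetically, a design that captures cases ten times as heavily as non-cases; the function names and the factor of 10 are illustrative assumptions.

```python
def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    return (a * d) / (b * c)


def risk_ratio_as_if_cohort(a: float, b: float, c: float, d: float) -> float:
    # Only meaningful when row totals reflect the population at risk
    return (a / (a + b)) / (c / (c + d))


cohort = (58, 372, 27, 343)

# Hypothetical case-control sampling: cases (a and c) over-represented 10-fold
k = 10
case_control = (58 * k, 372, 27 * k, 343)

print(f"Odds ratio: {odds_ratio(*cohort):.2f} vs {odds_ratio(*case_control):.2f}")
print(f"'RR':       {risk_ratio_as_if_cohort(*cohort):.2f} vs "
      f"{risk_ratio_as_if_cohort(*case_control):.2f}")
```

The factor k cancels in a.d/b.c, so the odds ratio is unchanged, whereas the quantity computed like a relative risk shifts with the sampling fractions and no longer estimates anything useful.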


Confounding: 1

Confounders are “nuisance” variables that make over-simple conclusions misleading!

Example ~ suppose in a study population the TRUE average figures are as below, so tea/coffee drinking adds 4 mm Hg to diastolic blood pressure:

| Average diastolic BP (mm Hg) | Overweight | Not overweight |
|---|---|---|
| Tea/coffee drinker | 94 | 74 |
| Non-drinker of tea/coffee | 90 | 70 |


Confounding: 2

Now say the numbers of individuals in the study are:

| Numbers of individuals | Overweight | Not overweight |
|---|---|---|
| Tea/coffee drinker | 300 | 100 |
| Non-drinker of tea/coffee | 50 | 150 |

If the study ignores overweight status and calculates simple averages, it could expect diastolic BPs as follows:

Drinkers: [(94 x 300) + (74 x 100)]/(300 + 100) = 89;

non-drinkers: [(90 x 50) + (70 x 150)]/(50 + 150) = 75.

This gives a misleading 14 mm Hg difference, when the true effect is 4 mm Hg. Confounders are only corrected for if someone thinks of them!
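The crude and stratified comparisons can be set side by side in a few lines of Python. The sketch below uses the averages and counts from the two tables; the dictionary layout and function name are illustrative choices, not part of the course material.

```python
WEIGHT_GROUPS = ("overweight", "not overweight")

# True average diastolic BP (mm Hg) in each cell
mean_bp = {
    ("drinker", "overweight"): 94, ("drinker", "not overweight"): 74,
    ("non-drinker", "overweight"): 90, ("non-drinker", "not overweight"): 70,
}
# Numbers of individuals in each cell
counts = {
    ("drinker", "overweight"): 300, ("drinker", "not overweight"): 100,
    ("non-drinker", "overweight"): 50, ("non-drinker", "not overweight"): 150,
}


def crude_mean(group: str) -> float:
    """Average BP for a drinking group, ignoring overweight status."""
    total = sum(mean_bp[(group, w)] * counts[(group, w)] for w in WEIGHT_GROUPS)
    size = sum(counts[(group, w)] for w in WEIGHT_GROUPS)
    return total / size


crude_difference = crude_mean("drinker") - crude_mean("non-drinker")
stratified_differences = {
    w: mean_bp[("drinker", w)] - mean_bp[("non-drinker", w)] for w in WEIGHT_GROUPS
}

print(f"Crude difference: {crude_difference:.0f} mm Hg")
print(f"Within-stratum differences: {stratified_differences}")
```

Comparing like with like inside each weight group recovers the 4 mm Hg effect; the extra 10 mm Hg in the crude figure comes entirely from drinkers being more often overweight.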


Practical work follows to ensure learning objectives are achieved…