1. ph250b.14 measures of disease part 1

Measures of Disease: Learning Objectives

1. Understand different types of populations as conceptualized in epidemiology and the relevance of population types to measures of disease
2. Understand the concept of disease occurrence in time
   a. Understand and be able to define concepts of disease occurrence in time at a population level (age, period, cohort effects)
   b. Understand and be able to define concepts of disease occurrence in time at an individual level (i.e., latent period, lead time), and their implications for measuring disease at the population level
3. Understand and be able to define and contrast prevalence and incidence
4. Understand and be able to define and contrast risks and rates
5. Calculate and interpret prevalence (this includes knowing the formula)
6. Understand, define, calculate, and interpret cumulative incidence (this includes knowing the formulas)
   a. Know different methods for calculating cumulative incidence and the assumptions and purposes of each
7. Understand, define, calculate, and interpret incidence density (this includes knowing the formulas)
   a. Understand and calculate person-time
8. Define and interpret a hazard rate
9. Understand and be able to convert between prevalence, cumulative incidence, and incidence density (this includes knowing the formulas)
10. Direct and indirect standardization
   a. Perform both and understand when each is appropriate (know the formulas)
   b. Know what data are required for each

Measures of disease

Module 2
PH250B Epidemiologic Methods II

Measures of disease outline
– Big picture
– Illustration/discussion of measuring disease in time
– Populations
– Time scales affecting disease in populations
– Epidemiologic measures
  • Basic concepts
  • Measuring diseases
  • Prevalence
  • Incidence density (incidence rate)
  • Cumulative incidence (risk)
  • Relations among measures
– Standardization
– Summary
– Appendix: specific measures of disease

Big picture

• In epidemiology, one of our major goals is to measure occurrence of disease
  – Tool for surveillance (the “distribution” of disease; descriptive epidemiology)
  – Tool for etiologic/risk factor research (the “determinants” of disease; analytical epidemiology)

Big picture

• Critical part of etiologic research
  – We compare measurements of disease between groups of people (e.g., exposed and unexposed) because we are interested in associations between exposures and outcomes and, ultimately, the causal effects of exposures on outcomes

Big picture

• Reminder: we compare disease occurrence between groups of people that have different exposures because we do not observe the counterfactual outcomes for each person in the population
  – Comparisons of disease occurrence are covered in the next module (measures of association)
• A key step in the etiologic research process is accurate measurement of disease occurrence

Big picture

Disease in time
• For measuring disease occurrence in a given population there are two important components
  – Measuring the disease outcome
  – Measuring and accounting for the time over which disease occurs
• Rothman: “disease occurrence in a population reflects two aspects of individual experiences: the amount of time the individual is at risk in the population, and whether the individual actually has the focal event (e.g., gets disease) during that time.”

Disease in time

Modification of Szklo Fig. 2-2 – participants observed every 2 months (vs 1)

Disease in time
The sum of the time that all members of the population are observed is called person-time

People move through time in a study, with and without disease

People are observed for differing amounts of time

Disease in time
How many people were in our study?

How many got disease?

So we could say 6/10 got disease over 2 years

Disease in time

• Any concerns about this measure?

Disease in time

• How might you account for the time contributed by each person?

Disease in time

• Any concerns about this measure?

Disease in time
• If interested in a rate of disease
  – # Disease/person-time
• If interested in a risk of disease, but want to account for different times people were observed
  – ?

Cohort from Fig 2-2 by 2-month intervals

Interval | Time period | Number in at-risk population | Number developing disease | Number withdrew | Proportion of at-risk population developing disease
j        | (tj-1, tj)  | N’0j                         | Ij                        | Wj              | Rj = Ij/(N’0j - (Wj/2))
2        | (0, 2)      | 10                           | 1                         | 1               | 1/(10 - (1/2)) = 0.11
4        | (2, 4)      | 10 - 1 - 1 = 8               | 1                         | 0               | 1/8 = 0.125
6        | (4, 6)      | 8 - 1 - 0 = 7                | 0                         | 0               | 0/7 = 0

Adapted from Kleinbaum, Table 6.1
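
The interval risks in this table can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name and the tuple layout are assumptions made for the example, and the inputs are the three rows shown in the table.

```python
# Illustrative sketch: interval risk R_j = I_j / (N'_0j - W_j / 2),
# applied to the three table rows. Names are hypothetical, not from the slides.

def actuarial_interval_risk(at_risk, cases, withdrawals):
    """Risk in one interval, assuming withdrawals leave halfway through on average."""
    return cases / (at_risk - withdrawals / 2)

# (end of interval in months, number at risk at start, incident cases, withdrawals)
rows = [(2, 10, 1, 1), (4, 8, 1, 0), (6, 7, 0, 0)]

interval_risks = {}
for end, at_risk, cases, withdrawals in rows:
    interval_risks[end] = actuarial_interval_risk(at_risk, cases, withdrawals)
    print(f"interval ending at month {end}: risk = {interval_risks[end]:.3f}")
```

Subtracting half the withdrawals from the denominator is what distinguishes this from simply dividing cases by the number at risk at the start of the interval.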

Populations

• Populations in epidemiology
  – Group of people for whom we are interested in the occurrence of disease or the effect of an exposure on disease
  – Defined by: geography, occupation, demographic characteristics (age, race/ethnicity, gender), time, etc.

Populations

• Populations in epidemiology
  – Examples:
    • Residents of NYC on 9/11/2001
    • Women of childbearing age in Alameda County 1980-2000
    • Live singleton births in Bangladesh in 2005

Populations

• Total population
  – Includes everyone in a particular population
• Candidate or “at risk” population
  – People in the total population who could get the disease/condition of interest
  – Excludes those who have the disease or who are immune (or do not have the necessary organ or physiological function, etc.)

Populations

• Candidate or “at risk” population
  – Example: candidate population for pregnancy excludes men, currently pregnant women, women with hysterectomy, and older women

Populations

• Closed or fixed populations
  – Membership is permanent and defined by some life event
  – Add no new members and only lose members to death
  – The size of the population will eventually reach 0 because everyone ultimately dies

Populations

• Closed or fixed populations
  – Examples: being born in 1975, serving in Iraq or Afghanistan

Populations

• Open populations
  – Gain members over time through immigration or birth
  – Lose members through emigration or death
  – Sometimes called dynamic, but this is a misnomer because both open and closed populations are changing
  – If membership can be lost due to events other than death, then the population is open

Populations

• Open populations
  – Examples: most populations such as cities, states, hospital populations, etc.

Populations

• Steady state populations – a type of open population
  – When the number of persons entering the population is balanced by the number exiting over a period of time
  – Example: a city where the number of people moving out or dying is approximately equal to the number moving in or being born over a given time interval
  – Example: population of women in the maternity ward at Alta Bates hospital

Populations

• Distinctions can depend on measurement of time or disease
  – Example: a population that starts a new drug could be considered closed if only the population starting at a particular time is included, but if new users of the drug are allowed to enter the population it could be considered open

Populations

• Relevance
  – Population properties are important to consider in study planning
    • Example: when studying a particular outcome (e.g., pregnancy) need to make sure you study a population “at-risk” of that outcome (e.g., women of certain ages)
    • Example: should define your study population so that you can address your study question in that population (e.g., differences in PTSD between OEF/OIF Veterans and civilians vs differences in PTSD among OEF/OIF Veterans)

Populations
    • Example: when studying exposure to a fixed event (e.g., Hurricane Katrina), the population of interest is fixed/closed and a study would need to be designed to capture that population appropriately
    • Example: a population of interest may be open (e.g., tourists visiting a given city) and a study would need to be designed to capture that population appropriately
  – Important to consider in calculation and interpretation of measures of disease (more later in relations between measures)

Time scales

• Disease occurrence at a population level is affected by different time scales
  – Age, period, and cohort effects

Time scales

• People are conceived, born, and then move through time until death with a variety of health states and events along the way

• Time (and exposures in time) can affect health/disease in three main ways
  – Age effects: biological age of individuals
  – Period effects: calendar time
  – Cohort effects: year of birth

Time scales

• Age effect
• Definition: variation in health status arising from social or biological consequences of aging
• What it looks like when graphed
  – Rate (of disease) changes with age
  – Irrespective of birth cohort and calendar time

Time scales

• Age effect
• Example: rate of heart disease increases with age regardless of whether you examine a population born in 1900 or 1950; at the age of 50 the rate of heart disease is higher than at age 30

Time scales

• Age effect - depiction

Time scales

• Period effect
• Definition: variation in health status arising from changes in physical, ecological, or social environment during a time period
• What it looks like
  – Change in rate (of disease) affecting an entire population at some point in time
  – Irrespective of age and birth cohort

Time scales

• Period effect
• Example: DDT spraying in 1950s led to increased risk of certain cancers for anyone living in affected areas in the 1950s, regardless of how old they were or when they were born

Time scales

• Period effect - depiction

Time scales

• Cohort effect
• Definition: variation in health status arising from exposures that vary by cohort
• What it looks like:
  – Change in the rate (of disease) according to membership in some cohort
  – Birth cohort is established by year of birth
    • Note that one can examine cohorts defined by any life event which places a person permanently in a group
  – Irrespective of age and calendar time

Time scales

• Cohort effect
• Example: Women exposed to DES in utero have increased risk of vaginal and cervical cancer at all ages and over all time periods

Time scales

• Cohort effect - depiction

Time scales
A real example
• Peptic ulcer mortality (Susser 1982, reprinted 2001)
  – Cohort effects: for those born after 1900, age-specific mortality from peptic ulcer was continually declining

Measures – basic concepts

• Proportion
• Numerator is included in the denominator (a/(a+b))
• Range: 0 to 1 (or 0% to 100%)
• Example: number of students with tattoos/total number of students in class (number with tattoos + number without tattoos)


• Ratio
• Numerator is NOT included in the denominator (a/b)
• Range: 0-infinity
• Example: number with tattoos/number without tattoos (odds of tattoo)
• Example: number of hospital beds/number of patients
• In epidemiology, you will see ratios of probabilities, rates, and odds (to be elaborated later)


• Odds
  – A ratio with wide application in epidemiology (more in measures of association, study designs, analysis of epidemiologic data)
• Odds of disease: number with disease/number without disease


• Rate
• Time is in the denominator
• Range: 0-infinity
• Examples: cases of flu/month, miles per hour
• Dimension is always 1/time or time^-1

Measures – measuring diseases

• Disease process and measuring disease
  – Induction period = time from causal action to biological onset
  – Latent period = time from biological onset to disease detection

[Timeline figure: Causal action → Biologic onset → Detectable by screening → Symptoms develop → Death]


• Disease process and measuring disease
  – Timing of disease process may differ between individuals
  – Timing of detection may differ between individuals

[Figure: disease timeline (Causal action, Symptoms develop, Death)]


• Disease process and measuring disease
  – Timing of disease process may differ between individuals

[Figure: disease timelines for individuals A and B, each marked Causal action, Biologic onset, Detectable by screening, Symptoms develop, Death]

JC: mention length bias


• Disease process and measuring disease
  – Timing of detection may differ between individuals

[Figure: disease timelines for individuals A and B, each marked Causal action, Symptoms develop, Death]

JC: mention lead time bias


• Defining disease outcome for a study
  – Have to consider underlying disease process and potential variations in that process
  – Have to consider how disease is being detected
    • This will influence what your measure of disease is capturing


• Example: prevalence of cancer (proportion with disease at a particular time) will miss cases of aggressive cancers

Epidemiologic measures

• Prevalence vs. incidence
  – Prevalence = proportion of the population with a disease
  – Incidence = frequency of development of new cases of disease in a population
    • New case is usually the first occurrence of a disease for a non-diseased person

Epidemiologic measures
• Risk vs. rate
• Risk = the probability of developing disease over a specified time period
  – Population measure that is often interpreted at the individual level
  – Must specify the time period for the risk to be meaningfully interpreted (X-year risk)
  – Example: 10-year risk of mortality among men diagnosed with prostate cancer is 0.1, i.e., 1 in 10 men diagnosed with prostate cancer die within 10 years

Epidemiologic measures
• Risk vs. rate
• Rate (average) = average change in disease status per unit of time over a time period relative to the size of the candidate population (incidence density)
• Example: There are 78 new cases of Lyme disease per 100,000 population per year in CT (estimated in 2008)
• Interpreted at population level
• A rate, so time is in the denominator

Epidemiologic measures
• Risk vs. rate
• Rate (instantaneous) = the instantaneous potential for change in disease status per unit of time at time t relative to the size of the candidate (i.e., disease-free) population at time t (hazard)
• The instantaneous rate (hazard) of Lyme disease on August 31, 2008 in CT is ?
  – Instantaneous rates cannot be directly calculated from epidemiologic data because they are defined for an infinitely small time interval
  – We can estimate average rates for smaller time intervals when we have sufficient data

Measures - prevalence

Prevalence
• Proportion of existing disease in the total population, without regard to when cases developed
• Numerator: number of existing cases of disease in the population
• Denominator: number of all persons in the population of interest
• A proportion
• Range is 0-1 - dimensionless
• Prevalence odds = prevalence of outcome/prevalence of no outcome = P/(1-P)
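
As a quick numeric illustration of the prevalence and prevalence odds formulas, here is a minimal Python sketch. The counts (150 existing cases in a population of 2,000) are made up for the example, and the function names are hypothetical.

```python
# Illustrative sketch with made-up counts; function names are hypothetical.

def prevalence(existing_cases, population_size):
    """Proportion of the population with existing disease (between 0 and 1)."""
    return existing_cases / population_size

def prevalence_odds(p):
    """Prevalence odds = P / (1 - P)."""
    return p / (1 - p)

p = prevalence(150, 2_000)     # hypothetical: 150 existing cases among 2,000 people
odds = prevalence_odds(p)
print(f"prevalence = {p:.3f}, prevalence odds = {odds:.3f}")
```

When prevalence is low, the odds and the prevalence are close to each other, because the denominator 1 - P is close to 1.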


Two types of prevalence measures:
• Point prevalence: the proportion of subjects who have disease at a specified point in time
  – Example: proportion of population that is HIV positive on July 1, 2010


Two types of prevalence measures:
• Period prevalence: the proportion of subjects in a population who have disease during a certain period of time
  – Uncommon - used when exact time of onset is difficult to determine
  – Example: proportion of population with an episode of depression over the past 12 months


Uses and limitations of prevalence
• A disease that has high incidence but is rapidly fatal or quickly cured would have low prevalence
• An exposure that increases survival with the disease will increase prevalence
• Useful for resource planning
• Can estimate the rate under certain conditions (more to come)


Uses and limitations of prevalence
• In measuring congenital anomalies we use prevalence out of necessity (many incident cases are lost, as are others in the denominator)
  – Cannot measure the population at-risk (conceptions) or person-time contributed by the population, so we necessarily take a point prevalence, the point being birth


Side note: Szklo and “prevalence rate”
• Szklo uses the term “prevalence rate” for prevalence
• Although you will see this in other places in the literature as well, you should not use this term
• Use the term prevalence
• Prevalence is not a rate and thus the term “prevalence rate” is incorrect and potentially confusing

Measures - incidence

Incidence time
• Not sufficient to just record proportion of population affected by disease
• Necessary to account for the time elapsed before disease occurs and the period of time during which the disease events take place

Measures - incidence
[Figure: panels A and B, each with the same proportion diseased: P(DA) = 0.6, P(DB) = 0.6]


Incidence time
• Incidence time is time from referent or zero time (e.g., birth, start of treatment or exposure, start of measurement period) until the time at which the outcome event occurs

• Also called event time, failure time, occurrence time


Incidence time
• “Censoring” occurs if the time of event is not known because something happens before the outcome occurs
  – Examples: loss to follow-up, death, surgery that makes the outcome impossible (e.g., hysterectomy), end of measurement period

• Average incidence time = average time until an event occurs

Measures – incidence density
Incidence density (ID) - aka incidence rate (IR)
• The rate of occurrence of new cases of disease during person-time of observation in a population at risk of developing disease
• Numerator: number of new cases of disease
  – Only count cases in the numerator that are contributing to person-time in the denominator
• Denominator: person-time of observation in population at risk
  – Only count contributions to the denominator that could yield cases for the numerator
• A rate
• Units are “inverse time” (1/time, time^-1)
• Range is 0-infinity

Measures – incidence density

Incidence density
• What is “person-time”?
• Person-time at risk: length of time that each individual is in the population at risk
  – Sum over population is total person-time at risk
• When a person is no longer “at risk” they cease contributing to person-time; this includes when they get the outcome of interest
• One person-year could be 2 people x 6 months each, 1 person x 12 months, 3 people x 4 months, etc.
• Helps account for censoring and different observation periods
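
The person-time bookkeeping described here can be sketched in Python. The follow-up records below are hypothetical and only illustrate that each person stops contributing time when they get disease or are censored.

```python
# Illustrative sketch with hypothetical follow-up data (not from the slides).
# Each record: (months of disease-free observation, developed disease?)
follow_up = [
    (12, False),  # observed the whole year, no disease
    (6, True),    # developed disease at 6 months, stops contributing person-time
    (4, False),   # lost to follow-up at 4 months (censored)
    (12, False),
    (2, True),
]

person_months = sum(months for months, _ in follow_up)
person_years = person_months / 12
new_cases = sum(1 for _, diseased in follow_up if diseased)

# Incidence density: new cases per unit of person-time at risk
incidence_density = new_cases / person_years
print(f"{new_cases} cases / {person_years:.1f} person-years "
      f"= {incidence_density:.2f} per person-year")
```

Note that the censored person still contributes 4 months to the denominator even though they contribute no case to the numerator.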

Measures – incidence density
“Figure 2 suggests that ID may be viewed as the concentration or 'density' of new case occurrences in a sea of population time. The more dots per unit area under the curve, the greater is the ID.”

Morgenstern et al. 1980


Person-time calculations for individual level data

1) If exact time contribution of each individual is known:
  – Sum the disease-free observation time


Person-time calculations for individual level data

2) If data on each individual is collected at regular intervals:
  – Estimate the disease-free observation time in each interval
  – Note: variants of this formula also subtract Ij/2 from N’0j

Measures – incidence density
Person-time estimation from group level data
1) If the population is in steady state, can estimate based on population size (N’) and duration of follow-up (Δt)
2) If the population is not in steady state, can estimate based on mid-interval population (N’1/2) and duration of follow-up (Δt)
  – Note: mid-interval population size can be estimated as: (Nt0 + Nt1)/2
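
A minimal sketch of the two group-level estimates follows. It assumes person-time is approximated as population size times duration of follow-up, which is the usual reading of the slide text, though the slide's own formulas are not shown here. Function names and population figures are hypothetical.

```python
# Illustrative sketch; assumes person-time ~ population size x duration of follow-up.
# Function names and population figures are hypothetical.

def person_time_steady_state(population_size, years):
    """Steady state: person-time is roughly N' x delta-t."""
    return population_size * years

def person_time_mid_interval(n_start, n_end, years):
    """Not in steady state: use the mid-interval population, (N_t0 + N_t1) / 2."""
    return (n_start + n_end) / 2 * years

steady = person_time_steady_state(10_000, 2)           # stable population over 2 years
growing = person_time_mid_interval(10_000, 12_000, 2)  # population grows over 2 years
print(steady, growing)
```

The mid-interval version amounts to assuming the population changes roughly linearly over the follow-up period.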


Uses and limitations of incidence density
• Appropriate for fixed or dynamic populations; does not assume that everyone is followed for specified time period

• Does not distinguish between people who do not contribute to disease incidence because they were not in the study population long enough for disease to develop and those who do not contribute because they never got the disease (relates to next point)


Uses and limitations of incidence density
• 100 person-years could come from following 100 people for one year or two people for 50 years; there is no way to tell the difference without knowing the incidence time
  – Have to consider whether study design allowed appropriate time to elapse to plausibly consider an exposure-disease relation
  – Disease process is important to consider in developing appropriate study design and disease measures
  – Example: disease-free cohort of 50 exposed and 50 unexposed followed for 1 year might not allow sufficient time to elapse for exposure to cause disease


• In class exercise
  – Study population observed monthly for 6 months
  – What is the person-time contributed by this population?
  – What is the incidence density?

Measures – incidence

Hazard rate
• The instantaneous potential for change in disease status per unit of time at time t relative to the size of the candidate (i.e., disease-free) population at time t

• Instantaneous rate in contrast to incidence density which is an average rate

• Cannot be directly calculated because it is defined for an infinitely small time interval

• Hazard function over time can be estimated using modeling techniques (more in the analyzing epidemiologic data section)


Hazard rate


Survival function
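
The formulas that accompanied these two slides are not reproduced here. For reference, the standard textbook definitions are sketched below; the notation (T for the time of disease onset, h for the hazard, S for the survival function) is an assumption and may differ from the slides.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t},
\qquad
S(t) = P(T > t) = \exp\!\left(-\int_0^t h(u)\,du\right)
```

The limit in the first expression is what makes the hazard an instantaneous rate, in contrast to incidence density, which averages over an observed interval.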

Measures – cumulative incidence

Cumulative incidence (CI) – aka risk, incidence proportion (IP – Rothman)

• The proportion of a closed population at risk that becomes diseased within a given period of time

• Numerator: number of new cases of a disease or a condition (Rothman calls this A)

• Denominator: number of persons in population at risk (Rothman calls this N)

• A proportion
• Range is 0-1 – dimensionless


Cumulative incidence
• Calculated for a fixed time period
  – Only interpretable with information on time period over which it was measured
• Population measure that translates most readily to individual
  – Interpreted as capturing individual risk of disease
• Different methods for calculating
  – Variations depending on how time at risk is handled
  – Option for calculating from rate measure


• Different methods for calculating
  – Simple cumulative
  – Actuarial
  – Kaplan-Meier
  – Density


• Subscript notation
  – R(t0,tj) – risk of disease over the time interval t0 (baseline) to tj (time j)
  – R(tj-1,tj) – risk of disease over the time interval tj-1 (time before time j) to tj (time j)


• Subscript notation
  – N’0 – number at risk of disease at t0 (baseline)
  – N’0j – number at risk of disease at the beginning of interval j


• Subscript notation
  – Ij – incident cases during the interval j
  – Wj – withdrawals during the interval j


Simple cumulative method: R(t0,tj) = CI(t0,tj) = I/N’0

• Risk calculated across entire study period assuming all study participants followed for the entire study period, or until disease onset
  – Assumes no death from competing causes, no withdrawals
• Only appropriate for short time frame


Simple cumulative method:
• Example: incidence of a foodborne illness if all those potentially exposed are identified
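
A small Python sketch of the simple cumulative method, using a hypothetical foodborne outbreak (120 people at risk at baseline, 30 of whom became ill). The numbers and the function name are made up for illustration.

```python
# Illustrative sketch with hypothetical outbreak counts (not from the slides).

def simple_cumulative_incidence(new_cases, at_risk_at_baseline):
    """CI = I / N'_0, assuming everyone is followed for the whole period (no withdrawals)."""
    if at_risk_at_baseline <= 0:
        raise ValueError("the baseline at-risk population must be positive")
    return new_cases / at_risk_at_baseline

attack_risk = simple_cumulative_incidence(30, 120)
print(f"cumulative incidence over the outbreak period = {attack_risk:.2f}")
```

The result is only interpretable together with the time period it covers, here the duration of the outbreak.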

Measures – cumulative incidence
Actuarial method:

R(tj-1, tj) = CI(tj-1, tj) = Ij/(N’0j - Wj/2)

• Risk calculated accounting for fact that some observations will be censored or will withdraw
• Assume withdrawals occur halfway through each observation period on average
• Can be calculated over an entire study period
  – R(t0,tj) = CI(t0, tj) = I/(N’0-W/2)
• Typically calculated over shorter time frames and risks accumulated


Modification of Szklo Fig. 2-2 – participants observed every 2 months (vs 1)

• Where to start – set up table with time intervals
• Fill incident disease cases and withdrawals into appropriate intervals
• Fill in population at risk

Measures – cumulative incidence
Actuarial Method

• Calculate interval risk
• R(tj-1, tj) = Ij/(N’0j-(Wj/2))
• R(0,2) = 1/(10-(1/2)) = 0.11


• Calculate interval survival
• S(tj-1,tj) = 1-R(tj-1,tj)


• Calculate cumulative risk – example of time 0 to 10
• R(t0, tj) = 1 - Π (1 – R(tj-1,tj)) = 1 - Π (S(tj-1,tj))
• R(0, 10) = 1 – (0.89 x 0.88 x 1.0 x 1.0 x 0.85) = 0.34
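
The product formula can be checked with a short Python sketch. The inputs are the five interval survival values as printed on the slide, which are already rounded to two decimals, so the result comes out near 0.33 instead of the slide's 0.34; the slide's figure may reflect unrounded interval risks, which are not shown.

```python
import math

# Interval survival values as printed on the slide (rounded to two decimals)
interval_survival = [0.89, 0.88, 1.0, 1.0, 0.85]

# Cumulative survival is the product of the interval survivals (math.prod: Python 3.8+)
cumulative_survival = math.prod(interval_survival)

# Cumulative risk is the complement of cumulative survival
cumulative_risk = 1 - cumulative_survival
print(f"S(0,10) = {cumulative_survival:.3f}, R(0,10) = {cumulative_risk:.3f}")
```

Intervals with no incident cases have survival 1.0 and leave the product unchanged, which is why only three of the five intervals affect the result.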


• Calculate cumulative survival
• S(t0,tj) = 1-R(t0,tj)


• Intuition for why R(t0, tj) = 1 - Π (Sj) using conditional probabilities
• Example of 5 time intervals:
  – Π (Sj) = P(S1)*P(S2|S1)*P(S3|S2)*P(S4|S3)*P(S5|S4) = P(S5)
  – Product of first two terms: P(S2|S1)*P(S1) = P(S2)
  – Multiplying conditional probabilities gives you the unconditional probability of surviving up to any given time point
  – The value (1 - survival) up to (or at) a given time point is then the probability of not surviving up to that time point



• Exercise for home (discuss in lab)
  – Study population observed monthly for 6 months
  – Calculate the cumulative incidence of disease from month 0 to 6
