1 EPI-820 Evidence-Based Medicine (EBM) LECTURE 2: MEDICAL MEASUREMENT Mat Reeves BVSc, PhD Department of Epidemiology Michigan State University


EPI-820 Evidence-Based Medicine (EBM)

LECTURE 2: MEDICAL MEASUREMENT

Mat Reeves BVSc, PhD

Department of Epidemiology

Michigan State University


Objectives:

• 1. Understand biological and measurement variation and its effects on precision and validity.

• 2. Understand the components of variability

– biological and measurement

– between- and within-person/observer

• 3. Understand measures of variation and measures of agreement.

• 4. Understand the calculation and application of kappa (K).

• 5. Understand the consequences of variability in clinical data and possible remedies to ameliorate them.

• 6. Understand regression to the mean.


I. Variation in Clinical Data

• 1. Biologic Variation = variation in the actual entity being measured

• derives from the dynamic nature of physiology, homeostasis and pathophysiology.

• within (intra-person) biologic variability, and

• between (inter-person) biologic variability


Within-Person (day-to-day) and Between-Person Biological Variation: Coefficient of Variation (%) (see Winkel et al, 1974)

Variable        CV (Within)    CV (Between)
Na              0.7%           0.8%
K               4.3%           4.3%
Cl              2.1%           1.2%
Ca              1.7%           2.8%
BUN             12.3%          16.4%
Creatinine      4.3%           9.5%
Cholesterol     5.3%           13.6%
SGOT (AST)      24.2%          24.8%
TP              2.9%           5.7%


I. Variation in Clinical Data

• 2. Measurement Variation = variation due to the measurement process

• inaccuracy of the instrument (instrument error), and/or,

• inaccuracy of the person (operator error)

• can introduce both random error and bias


Analytical Variation - Coefficient of Variation (%) of Duplicate Samples

Variable        CV (Analytical)
Na              1.1%
K               2.6%
Cl              2.1%
Ca              2.1%
BUN             2.2%
Creatinine      3.4%
Cholesterol     3.1%
SGOT (AST)      7.3%
TP              1.7%
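The biological and analytical CVs above can be combined to gauge how much a single repeat measurement on the same person is expected to vary. The sketch below assumes the two sources of variation are independent, so that their variances add and the CVs combine in quadrature; that assumption and the function name `combined_cv` are illustrative, not part of the lecture.

```python
import math


def combined_cv(cv_biological: float, cv_analytical: float) -> float:
    """Combine two independent CVs (both in %) by adding them in quadrature."""
    return math.sqrt(cv_biological ** 2 + cv_analytical ** 2)


# Cholesterol: within-person CV 5.3% and analytical CV 3.1% (from the two tables).
cholesterol_total_cv = combined_cv(5.3, 3.1)
print(f"Cholesterol, total within-person CV: {cholesterol_total_cv:.1f}%")
```

Under this assumption the analytical component adds relatively little to the day-to-day biological variation for cholesterol.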


Validity

• Degree to which a measurement process measures what is intended i.e., accuracy.

• Lack of systematic error or bias.   

• A valid instrument will, on average, be close to the underlying true value.

• Assessment of validity requires a “gold standard” (a reference).


What if no gold standard? (e.g., pain, nausea or anxiety)

• Use an instrument or clinical scale to measure a specific phenomenon or construct.

• Criterion Validity - the degree to which the scale predicts a directly observable phenomenon e.g. APGAR score and neonatal survival.

• Content Validity - the extent to which the instrument includes all of the dimensions of the construct being measured e.g. does APGAR include all relevant patho-physiological parameters?

• Construct Validity - the degree to which the scale correlates with other known measures of the phenomenon e.g. how well does a new “Neonatal assessment scale” correlate with APGAR score?  


How do you measure validity?

• Dichotomous data

– sensitivity, specificity, and predictive values.

• Continuous data

– mean and standard deviation of the difference between surrogate measure and gold standard (see Bland and Altman, 1986).
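Both kinds of validity summary can be sketched in a few lines. The counts and paired readings below are hypothetical, chosen only to show the arithmetic; the function names are illustrative.

```python
import statistics


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Validity measures for a dichotomous test against a gold standard."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among the diseased
        "specificity": tn / (tn + fp),  # true negatives among the non-diseased
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def difference_summary(surrogate: list, gold: list) -> tuple:
    """Mean and SD of the paired differences (surrogate minus gold standard)."""
    diffs = [s - g for s, g in zip(surrogate, gold)]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical 2x2 counts.
acc = diagnostic_accuracy(tp=80, fp=10, fn=20, tn=90)

# Hypothetical paired continuous readings.
surrogate = [102.0, 98.0, 110.0, 95.0, 105.0]
gold = [100.0, 97.0, 107.0, 96.0, 100.0]
bias, sd_diff = difference_summary(surrogate, gold)

print(acc)
print(f"mean difference = {bias:.2f}, SD of differences = {sd_diff:.2f}")
```

The mean difference estimates systematic error (bias), while the SD of the differences reflects random disagreement between the two methods.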


Precision (or reliability or reproducibility)

• the extent that repeated measurements of a phenomenon tend to yield the same results (regardless of their accuracy!).

• Precision refers to the lack of random error

• Precision ~ 1 / random error   


Hard versus Soft Data ?

• Blood chloride level

• Left ventricular ejection volume

• Migraine severity

• 28-d stroke case-fatality rate 

• Indirect costs of school absenteeism 

• Direct costs of school absenteeism 

• Degree of depression 

• Alzheimer severity 

• Self-reported ability to do domestic chores 

• Self-reported ability to climb stairs 

• Patient preferences for induced labour 

• Self-reported assessment of health


Hard versus Soft Data

• No specific criteria to define “hard” data, attributes include:

• Consistency: the ability to preserve basic evidence (repeated observations are consistent) (most important attribute).

• Objectivity: observations are free of subjective influences.

• Quantifiable: the ability to express the result as a number.


Hard versus Soft Data

• Usually hard data are numeric measures, such as lab data, but not always (e.g., histology, cancer stage)

• Hard (numeric) data preferred to softer (qualitative) measures because they are more objective and reliable? (but see Feinstein AR et al, 1985, Will Rogers phenomenon)


Between and Within Person Variation

• Four categories of clinical variability:

• 1. Between-person biological variability

• 2. Within-person biological variability

• 3. Between-observer measurement variability

• 4. Within-observer measurement variability


ANOVA Model Conceptualization

• $y_{ijkl} = \mu_i + \alpha_{ij} + \beta_{ik} + \varepsilon_{il}$

• where:

– $y_{ijkl}$ = the observed measurement for individual i, measured at time j, by the kth observer at the lth replication.

– $\mu_i$ = individual's usual true mean (between-person biological variation).

– $\alpha_{ij}$ = perturbation due to biological variation at time j (within-person biologic variation).

– $\beta_{ik}$ = perturbation due to measurement error by the kth observer (between-observer measurement variation).

– $\varepsilon_{il}$ = perturbation due to measurement error at the lth replication (within-observer measurement variation).
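A small simulation can make the additive model concrete: each observation is an individual's true mean plus three independent perturbations, so the total variance should be close to the sum of the four component variances. The standard deviations, the population mean, the normality of each component and the seed below are all hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(820)  # fixed seed so the sketch is reproducible

# Hypothetical component standard deviations (arbitrary units).
POPULATION_MEAN = 100.0
SD_BETWEEN_PERSON = 10.0    # spread of individuals' true means
SD_WITHIN_PERSON = 5.0      # biological fluctuation over time
SD_BETWEEN_OBSERVER = 3.0   # observer-to-observer measurement error
SD_WITHIN_OBSERVER = 2.0    # replication-to-replication measurement error


def simulate_observation() -> float:
    """Draw one observation as true mean plus three independent perturbations."""
    true_mean = random.gauss(POPULATION_MEAN, SD_BETWEEN_PERSON)
    time_effect = random.gauss(0.0, SD_WITHIN_PERSON)
    observer_effect = random.gauss(0.0, SD_BETWEEN_OBSERVER)
    replication_effect = random.gauss(0.0, SD_WITHIN_OBSERVER)
    return true_mean + time_effect + observer_effect + replication_effect


observations = [simulate_observation() for _ in range(20000)]
observed_variance = statistics.variance(observations)
expected_variance = sum(
    sd ** 2
    for sd in (SD_BETWEEN_PERSON, SD_WITHIN_PERSON,
               SD_BETWEEN_OBSERVER, SD_WITHIN_OBSERVER)
)

print(f"expected total variance: {expected_variance:.1f}")
print(f"observed total variance: {observed_variance:.1f}")
```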


II. Statistical aspects of variability

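• A. Measures of Variation

• 1. Variance and Standard Deviation

• SD = square root of the average squared difference of individual values from the overall mean:

$SD = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

• For normally distributed data, about 68%, 95% and 99.7% of values lie within ±1, ±2 and ±3 SD of the mean.

• Example:

– Av. US Cholesterol = 220 mg/dl, SD = 15 mg/dl

– About 95% of individual readings are expected to fall within 190-250 mg/dl (mean ± 2 SD)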
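As a minimal sketch, the sample SD (n - 1 denominator) and the cholesterol range of 190-250 mg/dl (mean 220, SD 15, taken as mean ± 2 SD) can be computed as follows. The five individual readings are hypothetical.

```python
import math


def sample_sd(values: list) -> float:
    """Sample standard deviation with an n - 1 denominator."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))


def two_sd_range(mean: float, sd: float) -> tuple:
    """Range expected to hold roughly 95% of values if the data are normal."""
    return mean - 2 * sd, mean + 2 * sd


readings = [205.0, 220.0, 235.0, 210.0, 230.0]  # hypothetical cholesterol values, mg/dl
sd = sample_sd(readings)
low, high = two_sd_range(220.0, 15.0)

print(f"SD of the hypothetical readings: {sd:.2f} mg/dl")
print(f"Expected range for mean 220, SD 15: {low:.0f}-{high:.0f} mg/dl")
```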


A. Measures of Variation

• 2. Coefficient of Variation (CV)

$CV = \dfrac{SD}{\bar{x}} \times 100\%$

• represents the % variation of a set of measurements around their mean

• conceptualized as a “noise-to-signal ratio”

• useful index for comparing the precision of different instruments, individuals and/or laboratories.
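The CV is simply the SD expressed as a percentage of the mean, which is what makes it comparable across instruments or laboratories. In the sketch below the replicate readings for the two laboratories are hypothetical, and the function names are illustrative.

```python
import statistics


def cv_from_summary(sd: float, mean: float) -> float:
    """Coefficient of variation (%) from a known SD and mean."""
    return sd / mean * 100.0


def coefficient_of_variation(values: list) -> float:
    """Coefficient of variation (%) estimated from raw replicate values."""
    return cv_from_summary(statistics.stdev(values), statistics.mean(values))


# Hypothetical replicate readings of the same sample in two laboratories.
lab_a = [100.0, 101.0, 99.0, 100.0, 100.0]
lab_b = [100.0, 105.0, 95.0, 102.0, 98.0]

cv_a = coefficient_of_variation(lab_a)
cv_b = coefficient_of_variation(lab_b)
cholesterol_cv = cv_from_summary(sd=15.0, mean=220.0)

print(f"Lab A CV = {cv_a:.2f}%, Lab B CV = {cv_b:.2f}%")
print(f"CV for mean 220 mg/dl, SD 15 mg/dl = {cholesterol_cv:.1f}%")
```

On these made-up readings Lab A is the more precise laboratory, since both have the same mean but Lab A has the smaller CV.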


B. Measures of Agreement

• 1. Correlation (r)

• Pearson product moment correlation and Spearman’s rank correlation

• measures the degree of linear relationship between two variables (-1, +1)

• correlation between two sets of continuous measurements (= reliability) or extent of replication


1. Correlation (Cont’d)

• Two observers, same time period = inter-rater reliability.

• Single observer, two time periods = intra-rater reliability (test-retest reliability).

• Can have very high values of r, but little direct agreement between raters or instruments.

• Can only be used as a test of validity if the actual true values are known.
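The point that a high r does not imply agreement can be illustrated with a deliberately extreme, hypothetical case: a second rater who always reads exactly 5 units higher than the first. The readings and the offset are made up for illustration.

```python
import math


def pearson_r(x: list, y: list) -> float:
    """Pearson product moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


rater_a = [10.0, 12.0, 14.0, 16.0, 18.0]
rater_b = [a + 5.0 for a in rater_a]  # hypothetical systematic offset of 5 units

r = pearson_r(rater_a, rater_b)
mean_difference = sum(b - a for a, b in zip(rater_a, rater_b)) / len(rater_a)
exact_matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)

print(f"r = {r:.2f}, mean difference = {mean_difference}, exact matches = {exact_matches}")
```

The correlation is perfect, yet the two raters never give the same value.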


B. Measures of Agreement

2. Intra-class Correlation Coefficient (R or reliability)

• a measure of reliability for continuous or quantitative data

• an observed value (X) consists of two parts:

• X = T + e

– where:

• T = the “True” unknown level or “error-free” score or “steady state” or “signal”

• e = error (whether “biologic” or “measurement” error)

• true error-free value varies about some unknown mean ($\mu$) with a variance of $\sigma^2_T$.


2. R (Cont’d)

• error term is regarded as iid (mean = 0, variance = $\sigma^2_e$).

• Variance of X: $\sigma^2_X = \sigma^2_T + \sigma^2_e$

• relative size of error variance ($\sigma^2_e$) in relation to variance of true value ($\sigma^2_T$) is a measure of the imprecision.

• $R = \dfrac{\sigma^2_T}{\sigma^2_T + \sigma^2_e}$

• R = the proportion of the total variance due to subject-to-subject (or between-person) variability in the “true” value.

• As random error decreases, the value of R increases
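A minimal sketch of R follows. The first function applies the variance-ratio definition directly. The second estimates R from repeated measurements using the one-way random-effects ANOVA estimator, which is a standard technique swapped in here for illustration and not necessarily the method used in the lecture. The data are hypothetical.

```python
def reliability(var_true: float, var_error: float) -> float:
    """R as the share of total variance due to true between-person variance."""
    return var_true / (var_true + var_error)


def one_way_icc(subjects: list) -> float:
    """One-way random-effects ICC from equal-length lists of repeated measurements."""
    n = len(subjects)        # number of subjects
    k = len(subjects[0])     # measurements per subject
    grand_mean = sum(sum(s) for s in subjects) / (n * k)
    subject_means = [sum(s) / k for s in subjects]
    # Between-subject and within-subject mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for s, m in zip(subjects, subject_means) for x in s
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical duplicate measurements on three subjects.
duplicates = [[10.0, 12.0], [20.0, 22.0], [30.0, 28.0]]
icc = one_way_icc(duplicates)

print(f"R from variances 100 and 25: {reliability(100.0, 25.0):.2f}")
print(f"ICC estimated from duplicates: {icc:.3f}")
```

Shrinking the error variance in `reliability` moves R towards 1.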


3. Categorical data – Kappa (K)

• A measure of reliability for categorical or qualitative data.

• Kappa corrects for the degree of chance in the overall level of agreement, and is preferred over other measures (like overall percent agreement).

• K = (Po - Pe) / (1 - Pe) = (Actual agreement beyond chance) / (Potential agreement beyond chance)

• Po = the total proportion of observations on which there is agreement

• Pe = the proportion of agreement expected by chance alone.


Agreement matrix for kappa statistic (inter-rater agreement, 2 observers, dichotomous data)

  

 

                  OBSERVER B
OBSERVER A     Yes     No      TOTALS
Yes            a       b       f1
No             c       d       f2
TOTALS         n1      n2      N


Agreement matrix for kappa statistic (2 observers, dichotomous data)

  

 

                  OBSERVER B
OBSERVER A     Yes     No      TOTALS
Yes            69      15      84
No             18      48      66
TOTALS         87      63      150


K (Cont’d)

• Observed agreement (Po) = 78%

• (69 + 48)/150 = 0.78 or 78%.

• Agreement expected due to chance (Pe) = 51%.

• Calculated from the products of the marginal totals for cells a and d: [(87 x 84)/150 = 48.72] + [(63 x 66)/150 = 27.72]

• Then divide the sum [76.44] by 150 to get Pe = 0.51 or 51%.


K (Cont’d)  

• K = (Po - Pe) / (1 - Pe) = (0.78 - 0.51) / (1 - 0.51) = 0.27 / 0.49 = 0.55 or 55%

• Kappa varies from -1 to +1, with a value of zero denoting agreement no better than chance (negative values denote agreement worse than chance!)

• Interpretation:

Value of K       Strength of agreement
<0               Poor
0 - 0.20         Slight
0.21 - 0.40      Fair
0.41 - 0.60      Moderate
0.61 - 0.80      Substantial
0.81 - 1.0       Almost perfect
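The kappa calculation for the 2 x 2 agreement table (69, 15, 18, 48) can be reproduced with a short sketch. The function names are illustrative, and the cut-offs in `strength_of_agreement` simply encode the interpretation scale given on the slide.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> tuple:
    """Return (Po, Pe, kappa) for a 2x2 agreement table with cells a, b, c, d."""
    n = a + b + c + d
    po = (a + d) / n  # observed agreement
    # Chance agreement from the products of the marginal totals for cells a and d.
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return po, pe, (po - pe) / (1 - pe)


def strength_of_agreement(kappa: float) -> str:
    """Map kappa onto the descriptive scale shown on the slide."""
    if kappa < 0:
        return "Poor"
    for upper, label in ((0.20, "Slight"), (0.40, "Fair"),
                         (0.60, "Moderate"), (0.80, "Substantial")):
        if kappa <= upper:
            return label
    return "Almost perfect"


po, pe, kappa = cohen_kappa(a=69, b=15, c=18, d=48)
print(f"Po = {po:.2f}, Pe = {pe:.2f}, kappa = {kappa:.2f} ({strength_of_agreement(kappa)})")
```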


K - Issue of Prevalence

• The prevalence of the condition affects the likelihood that observers will agree purely due to chance - hence the importance of using kappa. Example:

• Observer A classified 120/150 patients as positive

• Observer B classified 130/150 patients as positive

• Pe is now 72%.
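The 72% figure can be checked from the marginal totals alone. The sketch assumes that 120/150 and 130/150 are the numbers each observer classified as positive, which is the reading that reproduces the slide's Pe; the function name is illustrative.

```python
def chance_agreement(positives_a: int, positives_b: int, n: int) -> float:
    """Expected chance agreement (Pe) for two observers rating n subjects."""
    negatives_a = n - positives_a
    negatives_b = n - positives_b
    return (positives_a * positives_b + negatives_a * negatives_b) / n ** 2


# Assumed reading: both observers call most patients positive.
pe_skewed = chance_agreement(120, 130, 150)

# Marginal totals from the earlier agreement table (84 and 87 positives).
pe_balanced = chance_agreement(84, 87, 150)

print(f"Pe with skewed marginals:   {pe_skewed:.2f}")
print(f"Pe with balanced marginals: {pe_balanced:.2f}")
```

The more lopsided the marginal totals, the more agreement arises by chance alone, so raw percent agreement becomes progressively less informative.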


K - More Complicated Scenarios

• Overall (summary) kappa:

– several observers or raters and/or where the subjects are classified into several different categories.

• Weighted kappa:

– measuring the relative degree of disagreement when subjects are classified into several ordinal categories (e.g., normal, slightly abnormal and very abnormal).

• Maclure and Willett (1987):

– Use kappa for dichotomous data or nominal polytomous data only.

– For ordinal data use either Spearman’s rank correlation or R.


IV. Consequences of variability of clinical data

• A. Clinical impact

– Errors in diagnosis, prognosis and even treatment.

– Clinical disagreement between clinicians.

• B. Research Impact

– Between-person biological variability is a prerequisite for etiologic studies.

– Random within-person variability (a form of unreliability) results in non-differential misclassification - with a resulting dilution or attenuation of effect.


B. Research impact

• Generally, imprecision has less impact in the research setting than in the individual clinical setting because one can average over a large number of observations (but the measure must still be valid).

• Variability and misclassification result in the need for larger sample sizes (and increased costs).

• Measurement errors can introduce bias if they do not occur at random - differential misclassification


Regression Dilution Bias

• Example: MacMahon et al., (1990)

• imprecision resulting from a single measurement of diastolic blood pressure resulted in a 60% attenuation of RR’s (for the effect of elevated blood pressure on stroke and MI).

• “regression dilution bias”.
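Regression dilution can be illustrated with a small simulation under the classical measurement error model, in which a single reading equals the true value plus independent random error. All numbers below (variances, true slope, sample size, seed) are hypothetical and chosen so that the reliability is 0.5; they do not reproduce the 60% attenuation reported in the example above.

```python
import random

random.seed(1990)  # fixed seed so the sketch is reproducible


def slope(x: list, y: list) -> float:
    """Least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


N = 20000
TRUE_SLOPE = 2.0

# True (usual) exposure, variance 1.
true_exposure = [random.gauss(0.0, 1.0) for _ in range(N)]
# Outcome depends on the true exposure.
outcome = [TRUE_SLOPE * t + random.gauss(0.0, 1.0) for t in true_exposure]
# A single reading adds random within-person error, also variance 1.
single_reading = [t + random.gauss(0.0, 1.0) for t in true_exposure]

slope_true = slope(true_exposure, outcome)
slope_observed = slope(single_reading, outcome)
reliability = 1.0 / (1.0 + 1.0)  # true variance / (true + error variance)

print(f"slope using true exposure:  {slope_true:.2f}")
print(f"slope using single reading: {slope_observed:.2f}")
print(f"true slope x reliability:   {TRUE_SLOPE * reliability:.2f}")
```

Under these assumptions the slope estimated from a single noisy reading is attenuated by roughly the reliability of the measurement.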


C. Regression towards the mean

• Group of individuals selected based on the results of an “abnormal” test can be divided into:

– a) those with a true underlying abnormal value, and

– b) those with a true underlying normal value (but random fluctuations resulted in an outlying [abnormal] value).

• On retesting, patients in group b are closer to their typical (normal) values, so, the overall mean is less extreme (= regression to the mean).

• Occurs when repeated observations are performed on a variable that is inherently variable.


C. RTTM

• Often interpreted as a sign of clinical improvement, regardless of effectiveness of treatment (an important explanation for the placebo effect)

• If first reading is d units higher than the true value ($\mu$), then on average, the next value will be closer to the mean by d(1 - r) units,

– where r is the correlation between the two measurements

– RTTM increases if d is large and r is small.

• RTTM is a general tendency for describing the average behaviour of a group, not necessarily individuals!!
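The d(1 - r) rule can be sketched directly. The blood pressure values and the correlation of 0.7 below are hypothetical, and the result describes the expected average for a group with that first reading, not the next reading of any one patient.

```python
def expected_regression(d: float, r: float) -> float:
    """Expected movement back towards the mean, d(1 - r)."""
    return d * (1.0 - r)


def expected_second_reading(true_mean: float, first_reading: float, r: float) -> float:
    """Expected average second reading given the first reading and correlation r."""
    d = first_reading - true_mean
    return first_reading - expected_regression(d, r)


# Hypothetical example: true mean 90, first reading 110, test-retest correlation 0.7.
second = expected_second_reading(true_mean=90.0, first_reading=110.0, r=0.7)
print(f"expected second reading: {second:.0f}")

# Less reliable measurements (smaller r) regress further.
for r in (1.0, 0.7, 0.3, 0.0):
    print(f"r = {r:.1f}: expected regression = {expected_regression(20.0, r):.1f} units")
```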


V. Remedies for variability of clinical data

• A. Within-person biologic variation

• Standardized measurements: use a standard protocol i.e., time of day, body position etc.

• Average repeated tests e.g., take several blood pressure readings.

• Use a less variable test e.g., for diabetes use glycosylated Hb, rather than blood glucose.

• Plot the data - what is the trend?

• Develop reference values for each individual - especially if:

– within-person variability <<< between-person variability

– this results in a wide reference range which makes it difficult to identify individual deviations

– e.g., body weight, PSA, EKG
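The benefit of averaging repeated tests can be quantified with the usual standard error relationship: the SD of the mean of n readings is the single-reading SD divided by the square root of n. The sketch assumes independent readings with the same within-person SD; the value of 8 mmHg is hypothetical.

```python
import math


def sd_of_mean(within_person_sd: float, n_readings: int) -> float:
    """SD of the average of n independent readings with the same SD."""
    return within_person_sd / math.sqrt(n_readings)


WITHIN_PERSON_SD = 8.0  # hypothetical within-person SD of blood pressure, mmHg

for n in (1, 2, 4, 9):
    print(f"{n} reading(s): SD of the average = {sd_of_mean(WITHIN_PERSON_SD, n):.1f} mmHg")
```

Averaging only reduces random variation; it does nothing to remove a systematic bias in the instrument or observer.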


B. Measurement Error

• Measurement imprecision corrected by adjusting the machine or re-training the tester, (or, average several values?).

• Measurement error that causes bias requires quality assurance testing. Fix by re-calibration (don’t average!!).


Sackett - Six strategies for preventing or minimizing clinical disagreements

• 1. Match diagnostic environment to the diagnostic task.

• 2. Corroborate key findings by:

– repeating observations and questions

– confirming information with other sources (e.g., family members)

– confirming key findings using appropriate diagnostic tests

– seeking confirmation from “blinded” colleagues

• 3. Report actual findings, then report inference.

• 4. Use appropriate technical aids to avoid imprecision (e.g., ruler).

• 5. “Blinded” assessments of diagnostic findings.

• 6. Apply skills of social sciences (establish understanding, follow a logical order, listen, observe, interrupt only where necessary).