
SENSITIVITY, SPECIFICITY, AND RELATIVES

BMED 6700

February 16, 2012

Overview

Definitions of Sensitivity, Specificity, Positive and Negative

Predictive Values, Likelihood ratio Positive and Negative, Measure

of Agreement.

Performance of Tests: ROC Curves and Area Under ROC.

Assessing Combined Tests

The following problem was posed by Casscells, Schoenberger, and

Graboys (1978) to 60 students and staff at an elite medical school:

If a test to detect a disease whose prevalence is 1/1000 has a false

positive rate of 5%, what is the chance that a person found to have

a positive result actually has the disease, assuming you know

nothing about the person’s symptoms or signs?

Assuming that the probability of a positive result given the

disease is 1, the answer to this problem is approximately 2%.

Casscells et al. found that only 18% of participants gave this

answer. The most frequent response was 95%.
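The 2% figure follows directly from Bayes' rule. The sketch below is only an illustration; the variable names are chosen here and are not part of the original problem statement.

```matlab
% Casscells et al. (1978) problem via Bayes' rule.
pre = 1/1000;   % prevalence, from the problem statement
fpr = 0.05;     % false positive rate, i.e. 1 - specificity
se  = 1;        % sensitivity, assumed to be 1 as stated above

% P(disease | positive) = Se*Pre / (Se*Pre + (1 - Sp)*(1 - Pre))
ppv = se*pre / (se*pre + fpr*(1 - pre));
fprintf('P(disease | positive) = %.4f\n', ppv);   % prints 0.0196
```

The denominator is dominated by the false positives (0.05 x 0.999), which is why the answer is close to 2% rather than 95%.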

Fundamental 2 x 2 Table

                     disease (D)     no disease (C)    total
test positive (P)    TP              FP                nP = TP + FP
test negative (N)    FN              TN                nN = FN + TN
total                nD = TP + FN    nC = FP + TN      n

TP true positives (test positive, disease present)

FP false positives (test positive, disease absent)

FN false negatives (test negative, disease present)

TN true negatives (test negative, disease absent)

nP total number of positives (TP + FP)

nN total number of negatives (TN + FN)

nD total number with disease present (TP + FN)

nC total number without disease present (TN + FP)

n total sample size (TP+FP+FN+TN)

Definitions

Sensitivity (Se) Se = TP/(TP + FN) = TP/nD

Specificity (Sp) Sp = TN/(FP + TN) = TN/nC

Prevalence (Pre) Pre = (TP + FN)/(TP + FP + FN + TN) = nD/n

Positive Predictive Value (PPV) PPV = TP/(TP + FP) = TP/nP

Negative Predictive Value (NPV) NPV = TN/(TN + FN) = TN/nN

Likelihood Ratio Positive (LRP) LRP = Se/(1-Sp)

Likelihood Ratio Negative (LRN) LRN = (1-Se)/Sp

Apparent Prevalence (APre) APre = nP/n

Concordance, Agreement (Ag) Ag =(TP + TN)/n

Imagine a test that classifies all subjects as positive – trivially

the sensitivity is 100%. Since there are no negatives, the specificity

is zero. Likewise, the test that classifies all subjects as negative has

a specificity of 100% and zero sensitivity.

One of the most important measures is the Positive Predictive Value, PPV. It is the proportion of true positives among all positives, TP/nP. This ratio is correct only if the population prevalence is well estimated by nD/n, that is, if the table is representative of its population.

If the table is formed from a convenience sample, the prevalence (Pre) must be supplied as external information, and the positive predictive value is calculated as

$$\mathrm{PPV} = \frac{Se \times Pre}{Se \times Pre + (1 - Sp) \times (1 - Pre)}.$$

Why is the Positive Predictive Value so important? Imagine an almost perfect test for a particular disease, with a sensitivity of 100% and a specificity of 99%. If the prevalence of the disease in the population is 10%, then about 1 in 12 positives would be a false positive (PPV ≈ 0.92). However, if the population prevalence is 1/10000, then for each true positive there would be approximately 100 false positives.
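The effect of prevalence on PPV can be checked numerically. This is a minimal sketch using the sensitivity and specificity quoted above; the anonymous function `ppv` and the variable `fp_per_tp` are names introduced here for illustration.

```matlab
se = 1.00;      % sensitivity of the nearly perfect test
sp = 0.99;      % specificity

% PPV as a function of prevalence (vectorized over pre)
ppv = @(pre) se*pre ./ (se*pre + (1 - sp)*(1 - pre));

pre = [0.10, 1/10000];        % the two prevalences discussed above
p   = ppv(pre);               % approx. [0.9174, 0.0099]
fp_per_tp = (1 - p) ./ p;     % expected false positives per true positive

fprintf('Pre = %.4f: PPV = %.4f, FP per TP = %.2f\n', [pre; p; fp_per_tp]);
```

At 10% prevalence there are about 0.09 false positives per true positive; at 1/10000 there are about 100.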

The Likelihood Ratio Positive is the ratio of the probability of a positive test result in a patient with the disease to the probability of a positive result in a patient without it.

The Likelihood Ratio Negative is the corresponding ratio for a negative test result: the probability of a negative result in a patient with the disease divided by that in a patient without it. Likelihood ratios convert pre-test odds of disease into post-test odds:

Post-test Disease Odds (after a positive result) = LRP × Pre-test Disease Odds.

Post-test Disease Odds (after a negative result) = LRN × Pre-test Disease Odds.
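As an illustration of the odds form, the sketch below revisits the Casscells problem (prevalence 1/1000, false positive rate 5%, sensitivity assumed to be 1) and updates the disease odds after a positive result. Variable names are illustrative only.

```matlab
pre = 1/1000;  se = 1;  sp = 0.95;

lrp       = se / (1 - sp);               % likelihood ratio positive, about 20
pre_odds  = pre / (1 - pre);             % pre-test disease odds
post_odds = lrp * pre_odds;              % post-test disease odds after a positive
post_prob = post_odds / (1 + post_odds); % convert odds back to a probability

fprintf('LRP = %.1f, post-test probability = %.4f\n', lrp, post_prob);
```

The resulting probability, about 0.0196, agrees with the PPV obtained from Bayes' rule, as it must: the two calculations are algebraically equivalent.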

D-Dimer Example

The data below consist of quantitative plasma D-dimer levels

among patients undergoing pulmonary angiography for suspected

pulmonary embolism (PE). The patients who exceed the threshold

of 500 ng/mL are classified as positive.

The gold standard for PE is the pulmonary angiogram.

Goldhaber et al. (1993), from Brigham and Women’s Hospital at

Harvard Medical School, considered a population of patients who

are suspected of PE based on a battery of symptoms. The

summarized data for 173 patients are provided in the table below:

                              acute PE present    no PE present    total
positive (D-d ≥ 500 ng/mL)    42                  96               138
negative (D-d < 500 ng/mL)    3                   32               35
total                         45                  128              173

function [se sp pre ppv npv lrp ag yi] = sesp(tp, fp, fn, tn)
%
% INPUT: 2x2 Contingency (Confusion) Table
%   tp-true positives; fp-false positives;
%   fn-false negatives; tn-true negatives
%---------
% OUTPUT
%   se-sensitivity, sp-specificity, pre-prevalence (for random sample)
%   ppv-positive predictive value, npv-negative predictive value,
%   lrp-likelihood ratio positive, ag-agreement,
%   yi-Youden-type index; the printed values equal (se+sp-1)/sqrt(2)
% EXAMPLE OF USE:
%   D-dimer as a test for acute PE (Goldhaber et al, 1993)
%   [s1, s2, p1, p2, p3, lr, a, yi] = sesp(42,96,3,32);
%

[a b c d e f g h] = sesp(42,96,3,32);

    Se      Sp     Pre     PPV     NPV     LRP      Ag      Yi
0.9333  0.2500  0.2601  0.3043  0.9143  1.2444  0.4277  0.1296
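Only the header of sesp.m is shown, so the following script is a sketch of how the same numbers can be obtained directly from the formulas on the Definitions slide. It is not the author's file. The expression used for `yi` is an assumption inferred from the printed output, which matches (se + sp - 1)/sqrt(2).

```matlab
% D-dimer table at the 500 ng/mL threshold (Goldhaber et al., 1993)
tp = 42; fp = 96; fn = 3; tn = 32;
n  = tp + fp + fn + tn;

se  = tp/(tp + fn);            % sensitivity
sp  = tn/(fp + tn);            % specificity
pre = (tp + fn)/n;             % prevalence (valid for a random sample)
ppv = tp/(tp + fp);            % positive predictive value
npv = tn/(tn + fn);            % negative predictive value
lrp = se/(1 - sp);             % likelihood ratio positive
ag  = (tp + tn)/n;             % agreement
yi  = (se + sp - 1)/sqrt(2);   % ASSUMPTION: inferred from the printed values

fprintf('%8.4f', [se sp pre ppv npv lrp ag yi]);
fprintf('\n');
```

The printed row reproduces the values listed above to four decimal places.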

k Independent Tests, Parallel and Serial Strategies.

In the parallel strategy the combination is positive if at least one

test is positive. It is negative if all tests are negative.

$$Se = 1 - [(1 - Se_1) \times (1 - Se_2) \times \cdots \times (1 - Se_k)]$$

$$Sp = Sp_1 \times Sp_2 \times \cdots \times Sp_k.$$

The overall sensitivity is at least as large as any individual sensitivity, and the overall specificity is no larger than any individual specificity.

In the serial strategy, the combination is positive if all tests are

positive. It is negative if at least one test is negative.

$$Se = Se_1 \times Se_2 \times \cdots \times Se_k.$$

$$Sp = 1 - [(1 - Sp_1) \times (1 - Sp_2) \times \cdots \times (1 - Sp_k)]$$

The overall sensitivity is no larger than any individual sensitivity, while the overall specificity is at least as large as any individual specificity.
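The combination formulas follow from the independence of the tests. The short derivation below is a sketch that assumes the tests are conditionally independent given the true disease status; D denotes disease and C no disease, as in the 2 x 2 table.

```latex
% Parallel strategy: the combination is negative only if all k tests are negative.
\begin{aligned}
1 - Se &= P(\text{all } k \text{ tests negative} \mid D) = \prod_{i=1}^{k} (1 - Se_i), \\
Sp     &= P(\text{all } k \text{ tests negative} \mid C) = \prod_{i=1}^{k} Sp_i .
\end{aligned}

% Serial strategy: the combination is positive only if all k tests are positive.
\begin{aligned}
Se     &= P(\text{all } k \text{ tests positive} \mid D) = \prod_{i=1}^{k} Se_i, \\
1 - Sp &= P(\text{all } k \text{ tests positive} \mid C) = \prod_{i=1}^{k} (1 - Sp_i).
\end{aligned}
```

Because every factor lies between 0 and 1, each product is no larger than any of its factors, which gives the ordering statements for the overall sensitivity and specificity.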

Parikh et al (2008) provide an example of combining two tests for

sarcoidosis. Ocular sarcoidosis is an idiopathic multi-system

granulomatous disease, where the diagnosis is made by a

combination of clinical, radiological and laboratory findings. The

gold standard is a tissue biopsy showing noncaseating granuloma.

The angiotensin-converting enzyme (ACE) test has a sensitivity of 73% and a specificity of 83% to diagnose sarcoidosis.

An abnormal gallium scan (AGS) has a sensitivity of 91% and a specificity of 84%.

Though individually the specificity of either test is not

impressive, for the serial combination, the specificity becomes

Sp = 1− (1− 0.84)× (1− 0.83) = 1− (0.16× 0.17) = 0.97.

The combination sensitivity becomes 0.73× 0.91 = 0.66.
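For comparison, the sketch below evaluates both strategies for the ACE and AGS tests using the quoted sensitivities and specificities. It assumes the two tests are independent; the variable names are illustrative.

```matlab
se = [0.73, 0.91];    % sensitivities: ACE, AGS
sp = [0.83, 0.84];    % specificities: ACE, AGS

% Serial strategy: positive only if all tests are positive
se_ser = prod(se);            % 0.6643
sp_ser = 1 - prod(1 - sp);    % 0.9728

% Parallel strategy: positive if at least one test is positive
se_par = 1 - prod(1 - se);    % 0.9757
sp_par = prod(sp);            % 0.6972

fprintf('serial:   Se = %.4f, Sp = %.4f\n', se_ser, sp_ser);
fprintf('parallel: Se = %.4f, Sp = %.4f\n', se_par, sp_par);
```

The serial combination trades sensitivity for specificity, and the parallel combination does the opposite. Since `prod` works on vectors of any length, the same lines apply to k tests.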

The ROC (receiver operating characteristic) curve is defined as a graphical plot of sensitivity vs. (1 − specificity).

To increase the apparently low specificity in the previous D-dimer analysis, the threshold for a positive result is raised from 500 ng/mL to 650 ng/mL.

                              acute PE present    no PE present    total
positive (D-d ≥ 650 ng/mL)    31                  33               64
negative (D-d < 650 ng/mL)    14                  95               109
total                         45                  128              173

This new table results in

[a b c d e f g h] = sesp(31,33,14,95);

    Se      Sp     Pre     PPV     NPV     LRP      Ag      YI
0.6889  0.7422  0.2601  0.4844  0.8716  2.6721  0.7283  0.3048

Combining this with the output of 500 ng/mL threshold, we get

vectors 1-sp = [0 1-0.7422 1-0.25 1], and se = [0 0.6889

0.9333 1].

[Figure: rudimentary ROC curve for the D-dimer test. Horizontal axis: 1−specificity; vertical axis: sensitivity. Marked points: (0.2578, 0.6889) and (0.75, 0.9333). Area = 0.7297.]

The m-file RocDdimer.m plots this “rudimentary” ROC curve.

The curve is rudimentary since it is based on only two thresholds.

Note that points (0,0) and (1,1) always belong to ROC curves.

These two points correspond to the trivial tests in which all

patients test negative or all patients test positive.

The area under the ROC curve (AUC) is a well-accepted measure of test performance. The closer the area is to 1, the farther the ROC curve bows away from the diagonal toward the upper-left corner, implying that both sensitivity and specificity of the test are high. Some researchers assign an academic grading scale to AUC as an informal measure of test performance.

AUC performance

0.9-1.0 A

0.8-0.9 B

0.7-0.8 C

0.6-0.7 D

0.0-0.6 F

MATLAB code auc.m calculates AUC when the vectors

1-specificity and sensitivity are supplied.

function A = auc(csp, se)
% A = auc(csp,se) computes the area under the ROC curve
% where 'csp' and 'se' are vectors representing (1-specificity)
% and (sensitivity), used to plot the ROC curve
% The length of the vectors has to be the same
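The body of auc.m is not shown. One common way to compute such an area is the trapezoidal rule, sketched below for the D-dimer vectors given earlier. This is an assumed approach, not necessarily what auc.m does; `trapz` is the built-in trapezoidal integrator, and the manual sum is included to make the rule explicit.

```matlab
% ROC points for the D-dimer test, including the trivial points (0,0) and (1,1)
csp = [0, 1 - 0.7422, 1 - 0.25, 1];   % 1 - specificity
se  = [0, 0.6889, 0.9333, 1];         % sensitivity

% Trapezoidal rule written out: width times average height of each segment
A_manual = sum(diff(csp) .* (se(1:end-1) + se(2:end)) / 2);

% Same computation with the built-in function
A_trapz = trapz(csp, se);

fprintf('AUC = %.4f (manual), %.4f (trapz)\n', A_manual, A_trapz);
```

Both values agree with the area of 0.7297 reported for the rudimentary ROC curve.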

Exercise 1. HAAH Improves the Test for Prostate Cancer.

A new procedure based on a protein called human aspartyl

(asparaginyl) beta-hydroxylase, or HAAH, adds to the accuracy of

standard prostate-specific antigen (PSA) testing for prostate

cancer. The findings were presented at the 2008 Genitourinary

Cancers Symposium (Keith et al, 2008).

The research involved 233 men with prostate cancer and 43 healthy

men, all over 50 years old. Results showed that the HAAH test had

an overall sensitivity of 95% and specificity of 93%.

(a) From the reported percentages, form a table with true positives,

false positives, true negatives and false negatives (tp, fp, tn, and

fn). You will need to round to the nearest integers since the

specificity and sensitivity were reported as integer percents.

(b) Suppose that for men of age 50+ in the US, the prevalence of prostate cancer is 7%. Suppose Jim Smith is randomly selected from this group and tests positive with the HAAH test. What is the probability that Jim has prostate cancer?

(c) Suppose that Bill Schneider is a randomly selected person from the sample of n = 276 (= 233 + 43) subjects involved in the HAAH study. What is the probability that Bill has prostate cancer if he tests positive and no other information is available? What is this probability called? What is different here from (b)?

Solution:

(a) Recall that sensitivity is the ratio of true positives and total

number of subjects with the disease. Since 233 subjects are with

the disease, the sensitivity of 95% means that there are

233 · 0.95 = 221.35 ≈ 221 true positives. Thus tp = 221. This

gives 233− 221 = 12 false negatives, thus fn = 12.

Similarly, 43 subjects do not have disease. Since specificity is 0.93,

the true negatives are 43 · 0.93 = 39.99 ≈ 40. This means tn=40 and

fp = 3. The table is

                 disease          no disease       total
test positive    tp = 221         fp = 3           tot.pos = 224
test negative    fn = 12          tn = 40          tot.neg = 52
total            tot.dis = 233    tot.ndis = 43    total = 276

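(b) If the prevalence is external information,

$$P(\text{disease} \mid \text{test positive}) = \frac{\text{sensitivity} \cdot \text{prevalence}}{\text{sensitivity} \cdot \text{prevalence} + (1 - \text{specificity}) \cdot (1 - \text{prevalence})} = \frac{221/233 \times 7/100}{221/233 \times 7/100 + 3/43 \times 93/100} = 0.5058.$$

PPV is 0.5058.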

(c) If the table is representative of the population,

$$\mathrm{PPV} = \frac{tp}{tp + fp} = \frac{221}{224} = 0.9866.$$

PPV is 0.9866.

In both (b) and (c) we have found the positive predictive value, that is, P(disease present | test positive). However, (b) and (c) differ in the information about where the subject comes from, which is critical for the prevalence.

If the subject comes from the general population, then the prevalence is 0.07.

If we selected the subject from the group involved in this study (that is, the selected person is one of the 276 subjects), then the “prevalence” refers to this particular group and is (tp + fn)/n = 233/276.
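Both answers can be reproduced from the rounded counts in part (a). The script below is a sketch with illustrative variable names. It also checks that the Bayes formula with the in-sample prevalence 233/276 gives the same value as tp/(tp + fp).

```matlab
tp = 221; fp = 3; fn = 12; tn = 40;   % counts from part (a)
se = tp/(tp + fn);                    % 221/233
sp = tn/(tn + fp);                    % 40/43

% (b) external prevalence of 7%
pre_ext = 0.07;
ppv_b = se*pre_ext / (se*pre_ext + (1 - sp)*(1 - pre_ext));

% (c) subject drawn from the study sample itself
ppv_c = tp/(tp + fp);

% Consistency check: Bayes formula with the in-sample prevalence
pre_sample = (tp + fn)/(tp + fp + fn + tn);    % 233/276
ppv_check  = se*pre_sample / (se*pre_sample + (1 - sp)*(1 - pre_sample));

fprintf('(b) PPV = %.4f, (c) PPV = %.4f, check = %.4f\n', ppv_b, ppv_c, ppv_check);
```

The only difference between (b) and (c) is the prevalence plugged into the same formula: 0.07 versus 233/276.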

Exercise 2. Hypothyroidism.

Low values of a total thyroxine (T4) test can be indicative of

hypothyroidism (Goldstein and Mushlin, 1987). Hypothyroidism is

a condition in which the body lacks sufficient thyroid hormone.

Since the main purpose of thyroid hormone is to “run the body’s

metabolism”, it is understandable that people with this condition

will have symptoms associated with a slow metabolism. Over five

million Americans have this common medical condition.

A total of 195 patients, among whom 59 have confirmed hypothyroidism, have been tested for the level of T4. If the patients with T4-level ≤ 5 are considered positive for hypothyroidism, the following table is obtained:

T4 value            Hypothyroid    Euthyroid    Total
Positive, T4 ≤ 5    35             5            40
Negative, T4 > 5    24             131          155
Total               59             136          195

However, if the thresholds for T4 are 6, 7, 8 and 9, the following tables are obtained.

T4 value            Hypothyroid    Euthyroid    Total
Positive, T4 ≤ 6    39             10           49
Negative, T4 > 6    20             126          146
Total               59             136          195

T4 value            Hypothyroid    Euthyroid    Total
Positive, T4 ≤ 7    46             29           75
Negative, T4 > 7    13             107          120
Total               59             136          195

T4 value            Hypothyroid    Euthyroid    Total
Positive, T4 ≤ 8    51             61           112
Negative, T4 > 8    8              75           83
Total               59             136          195

T4 value            Hypothyroid    Euthyroid    Total
Positive, T4 ≤ 9    57             96           153
Negative, T4 > 9    2              40           42
Total               59             136          195

There is a tradeoff between sensitivity and specificity. One can

improve the sensitivity by moving the threshold to a higher T4

value; that is, make the criterion for a positive test less strict. One

can improve the specificity by moving the threshold to a lower T4

value; that is, make the criterion for a positive test more strict.

(a) For the test that uses T4 = 7 as the threshold, find the sensitivity, specificity, positive and negative predictive values, likelihood ratio, and degree of agreement. You can use the MATLAB program sesp.m.

(b) Using the given thresholds for the test to be positive, plot the

ROC curve. What threshold would you recommend? Explain your

choice.

(c) Find the area under the ROC curve. You can use MATLAB file

auc.m.

[a b c d e f g h] = sesp(35, 5, 24, 131);
%     Se      Sp     Pre     PPV     NPV      LRP      Ag      YI
% 0.5932  0.9632  0.3026  0.8750  0.8452  16.1356  0.8513  0.3935

[a b c d e f g h] = sesp(39, 10, 20, 126);
% 0.6610  0.9265  0.3026  0.7959  0.8630   8.9898  0.8462  0.4154

[a b c d e f g h] = sesp(46, 29, 13, 107);
% 0.7797  0.7868  0.3026  0.6133  0.8917   3.6563  0.7846  0.4005

[a b c d e f g h] = sesp(51, 61, 8, 75);
% 0.8644  0.5515  0.3026  0.4554  0.9036   1.9272  0.6462  0.2941

[a b c d e f g h] = sesp(57, 96, 2, 40);
% 0.9661  0.2941  0.3026  0.3725  0.9524   1.3686  0.4974  0.1840

se = [0, 0.5932, 0.6610, 0.7797, 0.8644, 0.9661, 1];
csp = [0, 1 - 0.9632, 1 - 0.9265, 1 - 0.7868, 1 - 0.5515, 1 - 0.2941, 1];
figure(1)
plot(csp, se, 'r-')        % ROC curve through the threshold points
hold on
plot(csp, se, 'ro')        % mark each threshold
plot([0 1], [0 1], 'r-')   % diagonal: a test no better than chance
xlabel('1 - specificity')
ylabel('sensitivity')
a = auc(csp, se)
% a = 0.8527 (Grade of B).
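For part (b), one possible way to compare thresholds is to pick the one that maximizes J = Se + Sp − 1, which measures how far the ROC point lies above the diagonal. The sketch below applies this criterion to the counts in the five tables. It is offered as an assumption about how one might choose, not as the exercise's intended answer, and other criteria (for instance, weighting false negatives more heavily) could lead to a different choice.

```matlab
thr = [5 6 7 8 9];              % T4 thresholds from the tables
tp  = [35 39 46 51 57];         % hypothyroid, test positive
fp  = [5 10 29 61 96];          % euthyroid, test positive
fn  = [24 20 13 8 2];           % hypothyroid, test negative
tn  = [131 126 107 75 40];      % euthyroid, test negative

se = tp ./ (tp + fn);           % sensitivity at each threshold
sp = tn ./ (tn + fp);           % specificity at each threshold

J = se + sp - 1;                % distance above the diagonal (up to scaling)
[Jmax, k] = max(J);

fprintf('Largest J = %.4f at threshold T4 <= %d\n', Jmax, thr(k));
```

Under this criterion the threshold T4 ≤ 6 is selected, with T4 ≤ 7 and T4 ≤ 5 close behind.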
