computer-aided detection improves detection of pulmonary nodules in chest radiographs beyond the...

10
ORIGINAL RESEARCH n THORACIC IMAGING 252 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014 Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images 1 Steven Schalekamp, MD Bram van Ginneken, PhD Emmeline Koedam, MD Miranda M. Snoeren, MD Audrey M. Tiehuis, MD, PhD Rianne Wittenberg, MD, PhD Nico Karssemeijer, PhD Cornelia M. Schaefer-Prokop, MD, PhD Purpose: To evaluate the added value of computer-aided detection (CAD) for lung nodules on chest radiographs when radiol- ogists have bone-suppressed images (BSIs) available. Materials and Methods: Written informed consent was waived by the institutional review board. Selection of study images and study setup was reviewed and approved by the institutional review boards. Three hundred posteroanterior (PA) and lateral chest radiographs (189 radiographs with negative findings and 111 radiographs with a solitary nodule) in 300 subjects were selected from image archives at four institutions. PA images were processed by using a commercially available CAD, and PA BSIs were generated. Five radiologists and three residents evaluated the radiographs with BSIs avail- able, first, without CAD and, second, after inspection of the CAD marks. Readers marked locations suspicious for a nodule and provided a confidence score for that loca- tion to be a nodule. Location-based receiver operating characteristic analysis was performed by using jackknife alternative free-response receiver operating characteristic analysis. Area under the curve (AUC) functioned as figure of merit, and P values were computed with the Dorfman- Berbaum-Metz method. Results: Average nodule size was 16.2 mm. Stand-alone CAD reached a sensitivity of 74% at 1.0 false-positive mark per image. Without CAD, average AUC for observers was 0.812. With CAD, performance significantly improved to an AUC of 0.841 (P = .0001). CAD detected 127 of 239 nodules that were missed after evaluation of the radiographs to- gether with BSIs pooled over all observers. Only 57 of these detections were eventually marked by the observers after review of CAD candidates. Conclusion: CAD improved radiologists’ performance for the detection of lung nodules on chest radiographs, even when base- line performance was optimized by providing lateral ra- diographs and BSIs. Still, most of the true-positive CAD candidates are dismissed by observers. q RSNA, 2014 1 From the Department of Radiology, Route 767, Radboud University Medical Center, Internal Postal Code 766, Post- bus 9101, 6500 HB Nijmegen, the Netherlands (S.S., B.v.G., E.K., M.M.S., N.K., C.M.S.); and Department of Radiology, Meander Medical Center, Amersfoort, the Netherlands (A.M.T., R.W., C.M.S.). Received June 7, 2013; revision requested August 7; final revision received October 25; accepted November 22; final version accepted January 9, 2014. Supported by a research grant from Riverain Tech- nologies, Miamisburg, Ohio. Address correspondence to S.S. (e-mail: [email protected]). q RSNA, 2014 Note: This copy is for your personal non-commercial use only. To order presentation-ready copies for distribution to your colleagues or clients, contact us at www.rsna.org/rsnarights.

Upload: cornelia-m

Post on 29-Mar-2017

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Orig

inal

res

earc

h n

Tho

raci

c im

agin

g

252 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014

computer-aided Detection improves Detection of Pulmonary nodules in chest radiographs beyond the support by Bone-suppressed images1

Steven Schalekamp, MDBram van Ginneken, PhDEmmeline Koedam, MDMiranda M. Snoeren, MDAudrey M. Tiehuis, MD, PhDRianne Wittenberg, MD, PhDNico Karssemeijer, PhDCornelia M. Schaefer-Prokop, MD, PhD

Purpose: To evaluate the added value of computer-aided detection (CAD) for lung nodules on chest radiographs when radiol-ogists have bone-suppressed images (BSIs) available.

Materials and Methods:

Written informed consent was waived by the institutional review board. Selection of study images and study setup was reviewed and approved by the institutional review boards. Three hundred posteroanterior (PA) and lateral chest radiographs (189 radiographs with negative findings and 111 radiographs with a solitary nodule) in 300 subjects were selected from image archives at four institutions. PA images were processed by using a commercially available CAD, and PA BSIs were generated. Five radiologists and three residents evaluated the radiographs with BSIs avail-able, first, without CAD and, second, after inspection of the CAD marks. Readers marked locations suspicious for a nodule and provided a confidence score for that loca-tion to be a nodule. Location-based receiver operating characteristic analysis was performed by using jackknife alternative free-response receiver operating characteristic analysis. Area under the curve (AUC) functioned as figure of merit, and P values were computed with the Dorfman-Berbaum-Metz method.

Results: Average nodule size was 16.2 mm. Stand-alone CAD reached a sensitivity of 74% at 1.0 false-positive mark per image. Without CAD, average AUC for observers was 0.812. With CAD, performance significantly improved to an AUC of 0.841 (P = .0001). CAD detected 127 of 239 nodules that were missed after evaluation of the radiographs to-gether with BSIs pooled over all observers. Only 57 of these detections were eventually marked by the observers after review of CAD candidates.

Conclusion: CAD improved radiologists’ performance for the detection of lung nodules on chest radiographs, even when base-line performance was optimized by providing lateral ra-diographs and BSIs. Still, most of the true-positive CAD candidates are dismissed by observers.

q RSNA, 2014

1 From the Department of Radiology, Route 767, Radboud University Medical Center, Internal Postal Code 766, Post-bus 9101, 6500 HB Nijmegen, the Netherlands (S.S., B.v.G., E.K., M.M.S., N.K., C.M.S.); and Department of Radiology, Meander Medical Center, Amersfoort, the Netherlands (A.M.T., R.W., C.M.S.). Received June 7, 2013; revision requested August 7; final revision received October 25; accepted November 22; final version accepted January 9, 2014. Supported by a research grant from Riverain Tech-nologies, Miamisburg, Ohio. Address correspondence to S.S. (e-mail: [email protected]).

q RSNA, 2014

Note: This copy is for your personal non-commercial use only. To order presentation-ready copies for distribution to your colleagues or clients, contact us at www.rsna.org/rsnarights.

Page 2: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Radiology: Volume 272: Number 1—July 2014 n radiology.rsna.org 253

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

archives of four institutions (three aca-demic and one nonacademic hospital), as follows: Radboud University Medical Center (Nijmegen, the Netherlands), University Medical Center (Utrecht, the Netherlands), Academic Medical Center (Amsterdam, the Netherlands), and Meander Medical Center (Amers-foort, the Netherlands). All images were derived from clinically indicated examinations.

Inclusion criteria were the presence of a solid solitary nodule (, 30 mm in diameter) and the availability of a PA and lateral chest radiograph and a chest CT scan obtained within 3 months. Radiographs showing signs of other disease, except for chronic obstructive pulmonary disease, were not included. All subjects had to be older than 40 years. For control patients, absence of disease was ascertained by a chest ra-diograph with negative findings and a CT scan with negative findings within 6 months of the chest radiograph.

To ensure a wide range of lesion conspicuities with a sufficient number of low-conspicuity lesions, visibility of the nodules was assessed by an expert radi-ologist and a clinical researcher (C.M.S. and S.S.) in consensus. Nodules had to be visible on the PA radiograph and, on

for computer-aided detection (CAD) have been developed. Since the first introduction of CAD, these programs have been continuously improved with respect to sensitivity and specificity. Researchers in observer studies so far have reported variable results. For in-stance, while investigators in one study (7) demonstrated an improved detec-tion of lung cancers from 68.2% to 76.7%, those in two other studies (8,9) did not find a significant improvement with CAD. As a possible underlying problem, the authors in those studies discussed that observers could not suf-ficiently discriminate between true-pos-itive (TP) and false-positive (FP) CAD candidate lesions: TP CAD candidate lesions were dismissed, while FP CAD candidate lesions were accepted by the readers (8,9). It is conceivable that combining both efforts, BSIs and CAD, leads to an improved detection perfor-mance for small and low-conspicuity nodules. The purpose of our study was to evaluate the added value of CAD for lung nodules in chest radiographs when radiologists have BSIs available.

Materials and Methods

Riverain Technologies (Miamisburg, Ohio) paid for this study, and one au-thor (C.M.S.) served on an advisory board for Riverain Technologies. Study design, data acquisition and analysis, as well as mansucript writing, were completely controlled by the authors without any influence by Riverain Technologies.

DataSelection of study images and study setup was reviewed and approved by the institutional review boards. Writ-ten informed consent was waived. We retrospectively selected 300 digital posteroanterior (PA) and lateral chest radiographs by reviewing the image

Chest radiography still represents the first-line method for diagno-sis of lung nodules regardless of

its known inferiority to computed to-mography (CT). In addition to the fact that a subset of nodules is therefore indeed not visible on a chest radio-graph, there is a subset of lesions that are well visible but are overlooked by the observer (1,2). The latter may be due to overprojection by other ana-tomic structures or may happen due to inattentional blindness by the ob-server, meaning that lesions are simply overlooked or no appropriate action is triggered once the lesion is noticed. Therefore, the goal of recent image processing techniques is to reduce the risk of overlooking nodular lesions that are well visible in retrospect. Accord-ing to the literature, the number of lung cancers initially missed on chest radiographs but retrospectively visible amounts to 19%–26% in various study populations (1,2). Overprojection by osseous structures has been reported as one reason leading to reduced de-tection of lung cancer (3,4). Bone-suppressed images (BSIs) address this issue and have been found to signifi-cantly improve observer performance for detection of pulmonary nodules, with an increase of sensitivity ranging from 4% to 17% (5,6).

To further reduce the risk of miss-ing a nodule that is visually discernible on the radiograph, several programs

Implication for Patient Care

n Combination of CAD and bone suppression in chest radiography improves detection of potentially early lung cancers.

Advances in Knowledge

n Computer-aided detection (CAD) improves observer performance for the detection of lung nodules on chest radiographs, beyond the application of bone suppression alone; CAD detected 53% (127 of 239) of the nodules that ini-tially were missed by the readers; readers accepted 45% (57 of 127) of these true-positive CAD candidates.

n The beneficial effect of CAD is limited by the insufficient ability of the observers to differentiate true-positive from false-positive CAD candidates.

Published online before print10.1148/radiol.14131315 Content code:

Radiology 2014; 272:252–261

Abbreviations:AFROC = alternative free-response receiver operating

characteristicAUC = area under the curveBSI = bone-suppressed imageCAD = computer-aided detectionFP = false-positivePA = posteroanteriorTP = true-positive

Author contributions:Guarantors of integrity of entire study, S.S., C.M.S.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manu-script revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; literature research, S.S., B.v.G., N.K., C.M.S.; clin-ical studies, S.S., B.v.G., E.K., A.M.T., R.W., N.K.; statistical analysis, S.S., A.M.T., N.K., C.M.S.; and manuscript editing, S.S., B.v.G., M.M.S., R.W., N.K., C.M.S.

Conflicts of interest are listed at the end of this article.

Page 3: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

254 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

available. Observers could review both PA and lateral images side by side. BSIs could be visualized with a key on the key-board and appeared at the exact same location as the PA radiograph. In this way, the reader could toggle between the original chest radiograph and the BSI to easily review corresponding areas in the radiograph. After the first scoring without CAD but with BSIs, CAD marks were automatically displayed and could also be toggled on and off, if desired, by the readers. Subsequently, the readers were asked to score location and degree of suspicion for the presence of a nodule for the second time: The readers were allowed to score new lesions, remove old lesions, or modulate previously de-termined scores after they had seen the CAD marks.

Radiologists had been given the in-formation that a maximum of one nod-ule was present in each examination. They knew the study set consisted of more control patients than patients with a nodule, but they did not know the exact numbers. Radiologists were assigned to read the chest radiographs in different random order, encouraging them to review chest radiographs with a reading speed they would have in clin-ical conditions.

StatisticsFor statistical analysis, multireader multiple-case jackknife alternative free-response receiver operating character-istic (AFROC) analysis was performed (11,12). A finding by an observer was considered a TP finding when the marking was within 1 cm of the center of the ground-truth annotation. As in-put for jackknife AFROC analysis, only one reader score per image is used for analysis. For cases with negative (nor-mal) findings, this is the FP finding with the highest score. For cases with pos-itive (abnormal) findings, markings of nonlesion locations are ignored, and only TP markings (if present) are used. In that way, we ensured that readers could not be rewarded for marking non-lesion locations in cases with positive findings. Area under the curve (AUC), which represents the probability that a lesion is rated higher than nonlesion

Reading MethodsFive radiologists, one of whom was an author (M.M.S., with 5 years of expe-rience) and four of whom were not au-thors (with 13, 3, 17, and 17 years of experience), and three residents (E.K., a 4th-year resident; R.W., a 2nd-year resident; and A.M.T., a 4th-year resi-dent) evaluated the radiographs in dif-ferent randomized orders. None of the readers had any previous experience with BSIs or CAD.

Readers reviewed the cases first without and subsequently with the use of CAD marks within one reading ses-sion. BSIs were available at all times. To familiarize the readers with the software used in the study, a training session of 40 cases, with instant feedback from the researcher, was provided in advance. The training set consisted of 22 patients with a nodule and 18 control patients without a nodule, which were not used in the study. Training images contained representative lesions and other abnor-malities (eg, old rib fractures) to famil-iarize the radiologists with the BSIs and the CAD output. The setup of the train-ing was similar to the review of study images, including the two-stage scoring, without and with CAD.

In the observer study, readers were able to mark and score suspicious re-gions in the PA chest radiograph. A ruler on the screen with a continuous scale between 0 and 100 was used to document the reader’s degree of sus-piciousness (confidence) that a nodule was present (0, not suspicious; 100, definitely suspicious). Observers were allowed to score multiple regions that were suspicious for a nodule per image. All readings with respect to localiza-tion and confidence scale were digitally documented. Readers did not have the ability to change previous annotations.

Readings took place at a 30-inch Digital Imaging and Communications in Medicine–calibrated liquid crystal dis-play monitor (FlexscanSX3031 W; Eizo, Ishikawa, Japan), with a native screen resolution of 2560 3 1600 in a darkened room, mimicking clinical reading condi-tions. Processing tools, such as adjust-ment of window and level, zoom in and zoom out, and gray-scale inversion, were

the basis of visual inspection, were clas-sified into four categories, ranging from well visible (category 1) to moderately subtle (category 2), subtle (category 3), and very subtle (category 4). Lesions were annotated on the PA chest radio-graph, after a coronal projection image of the CT scan was reviewed.

Nodule volume (in cubic millime-ters) was calculated from annotations made on CT scans (10). Subsequently, diameter of the nodule was calculated from the nodule volume, assuming each nodule to be a sphere. If known, path-ologic findings and follow-up data were derived from clinical records.

Image AcquisitionAll chest radiographs were obtained with a digital technique by using stor-age phosphor plates (CR; Agfa Health-Care, Mortsel, Belgium), a selenium drum (Thoravision; Philips Healthcare, Hamburg, Germany), and flat-panel–detector digital radiography systems (Siemens, Erlangen, Germany). Image postprocessing was applied as recom-mended by the manufacturer and used in clinical routine.

Image Processing SoftwareComputer-detection output was gener-ated by a commercially available CAD system (ClearRead +Detect 5.2; Riv-erain Technologies). The CAD system is optimized for detection of nodules between 9 and 30 mm in diameter, al-though larger and smaller nodules are also marked if they are localized with the software. Candidate lesions identi-fied with the software were marked by circles. CAD marks could be displayed on both the original radiograph and the PA BSIs. Only PA radiographs are pro-cessed with the system.

BSIs were computed by using soft-ware (ClearRead Bone Suppression 2.4; Riverain Technologies). The soft-ware generates PA radiographs identi-cal to the original image with respect to size and gradation characteristics, with the difference that overprojections of ribs and clavicle were digitally removed. Both software products are commer-cially available and have U.S. Food and Drug Administration approval.

Page 4: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Radiology: Volume 272: Number 1—July 2014 n radiology.rsna.org 255

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

as determined on the CT scan was 16.2 mm (median, 15.1 mm; range, 7.8–35 mm). When lesions were measured on the CT scans, five lesions exceeded 30 mm in diameter, herewith represent-ing a mass and not a nodule according to the Fleischner glossary (13). The conspicuity of these five lesions was categorized as being obvious (n = 1), moderately subtle (n = 3), and subtle (n = 1). Seventy-eight lesions were malig-nant, and malignancy was histologically proved in 67 cases and was based on clinical history in 11 cases. Twenty nod-ules were benign, and the pathologic findings in 13 nodules were unknown.

Stand-alone CADCAD reached a stand-alone sensitivity of 74% (82 of 111) at 1.0 FP mark per image (range, 0–5 marks per image). Sensitivity of CAD was 91% (29 of 32) for well-visible nodules and 88% (28 of 32) for moderately subtle nodules. Sensitivities for subtle and very subtle nodules were 62% (18 of 29) and 39% (seven of 18), respectively. Ninety-one percent (21 of 23) of the nodules larger than 20 mm in diameter were detected by CAD. For smaller nodules, detection rates were lower, with a detection rate of 62% (20 of 32) for nodules between 15 and 20 mm in diameter and a detec-tion rate of 77% (36 of 47) for nodules

test with respect to sex distribution and an unpaired t test with respect to age distribution. A significant difference was defined at P , .05.

Results

Patient CharacteristicsThree hundred subjects were selected for the study, and images from these subjects included 189 radiographs with negative findings and 111 radiographs with a solitary pulmonary nodule. Av-erage age was 65 years (range, 44–88 years) for patients with a nodule and 64 years (range, 41–88 years) for con-trol subjects. Differences in average age were not significant (P = .22). One hun-dred seventy-seven subjects were male (average age, 64 years; range, 44–87 years), and 123 were female (average age, 64 years; range, 41–88 years). Dif-ference in age between male and female subjects was not significant (P = .76). There was an even age distribution over the groups (nodule group: 66 male sub-jects [age, 66 years; range, 44–83 years] and 45 female subjects [age, 63 years; range, 66–88 years] vs control group: 111 male subjects [age, 63 years; range, 44–87 years] and 78 female subjects [age, 64 years; range, 41–88 years], with P = .9). Average nodule diameter

locations in images with negative find-ings, was calculated by using the trap-ezoidal integration method, also known as the Wilcoxon rank-sum test. AUCs without and with help of CAD were compared with the Dorfman-Berbaum-Metz method (DBMMRMC package, version 2.33, Medical Image Perception Laboratory, Department of Radiology, Carver College of Medicine, University of Iowa, Iowa City, Iowa; http://percep-tion.radiology.uiowa.edu), which in-cludes reader, case, and treatment var-iance. Also partial AUC, corresponding to a specificity range of 80%–100%, was calculated from the AFROC curves.

Further, sensitivity was calculated by dividing the number of correctly localized lesions by the total number of lesions. Specificity was calculated by dividing the number of nonmarked cases by the total number of cases with negative findings. These calculations were performed, taking all scores of suspiciousness into account, and were performed for all lesions, as well as for the subcategories of various conspicu-ities. Sensitivities and specificities were also calculated for different thresholds of suspiciousness scores given by the observers to investigate the shift in sensitivity and specificity with use of CAD. Differences in patient charac-teristics were compared by using a x2

Table 1

Total Number of Nodules Missed by CAD and Observers and Dismissed TP CAD Candidates by All Eight Observers Together

CharacteristicNo. of Nodules Missed by CAD

No. of Nodules Missed Pooled over All Readers

No. of Nodules Missed by Readers but Detected by CAD

Dismissed TP Candidates Pooled over All Readers

Conspicuity Well visible (n = 32) 3/32 (9) 9/256 (4) 9/9 (100) 3/9 (33) Moderately visible (n = 32) 4/32 (12) 36/256 (14) 24/36 (67) 10/24 (42) Subtle (n = 29) 11/29 (38) 82/232 (35) 51/82 (62) 26/51 (51) Very subtle (n = 18) 11/18 (61) 112/144 (78) 43/112 (38) 31/43 (72) Overall 29/111 (26) 239/888 (27) 127/239 (53) 70/127 (55)Diameter (mm) .25 (n = 7) 0/7 (0) 11/56 (20) 11/11 (100) 5/11 (45) 20–25 (n = 16) 2/16 (12) 26/128 (20) 18/26 (69) 8/18 (44) 15–20 (n = 32) 12/32 (38) 77/256 (30) 29/77 (38) 13/29 (45) 10–15 (n = 47) 11/47 (23) 102/376 (27) 56/102 (55) 36/56 (64) ,10 (n = 9) 4/9 (44) 23/72 (32) 13/23 (57) 8/13 (62) Overall 29/111 (26) 239/888 (27) 127/239 (53) 70/127 (55)

Note.—Numbers in parentheses are percentages.

Page 5: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

256 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

CAD detected 53% (127 of 239) of the nodules that were missed by the readers. Readers dismissed 55% (70 of 127) of these TP CAD candidates. If CAD led to modification of reader scores, it was beneficial in most cases (eg, leading in cases with a nodule to placement of new labels in 57 occa-sions and an increase of suspiciousness score in 220 occasions). These positive

Comparing sensitivity and specific-ity without and with CAD at different thresholds of suspicion, the largest in-crease of sensitivity was seen at high confidence scores above 90 (Table 3). This was associated with the smallest loss in specificity. Going to lower thresh-olds, increase of sensitivity became smaller and loss of specificity became larger.

between 10 and 15 mm in diameter. The CAD system localized 56% (five of nine) of the nodules smaller than 10 mm in diameter. On this data set, CAD reached an area under the AFROC curve of 0.656. Pooled over all readers, the readers missed 239 nodules without CAD. CAD detected 53% (127 of 239) of these nodules (Table 1).

In total, CAD generated 196 FP de-tections in cases with negative finding (n = 189). Most FP findings of the CAD system were provoked by the anterior contour of the first rib (n = 29) or by hilar vascular shadows (n = 69) (Fig 1). Seventy-one (38%) of the cases without a nodule (normal cases) and 10 (9%) of the cases with a nodule did not contain any CAD mark.

Observer PerformanceTotal area under the AFROC curve for the observers was 0.812 without CAD versus 0.841 with CAD (P = .0001) (Fig 2). Observers improved their perfor-mance at a specificity that ranged be-tween 80% and 100%, shown by an increase of the partial AUC from 0.125 to 0.133. All eight observers improved their performance with use of CAD (Figs 3, 4 ); individually, the differences reached significance for five observers (Table 2).

Figure 1

Figure 1: CAD performance: Left: Radiograph demonstrates approximate locations of all the 111 nodules that were detected with CAD (white stars) and not detected with CAD (black dots). Right: Same reference radiograph with the approximate locations of the FP marks (white dots) (n = 196) made by the CAD system in all cases without a nodule.

Figure 2

Figure 2: Graph shows average AFROC curve without and with CAD.

Page 6: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Radiology: Volume 272: Number 1—July 2014 n radiology.rsna.org 257

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

Mean scores of suspicion for the pres-ence of a lesion were 80, 66, 46, and 12 (on a scale from 0 to 100) for the four subsets of nodules with decreasing conspicuity. This factor indicates that

Lesion ConspicuityConsidering the markings by all eight observers, they located 649 of 888 nod-ules (eight readers 3 111 nodules) with-out CAD and 704 nodules with CAD.

effects counteracted the negative effects caused by placing new labels (n = 92) or in an increase of suspiciousness (n = 66) in cases without a nodule (normal cases) (Table 4).

Figure 3

Figure 3: A 12-mm adenocarcinoma in the left upper lobe in a 62-year-old man. The lesion was categorized as subtle (category 3). Without CAD, only one observer marked the tumors. With CAD, the tumor was correctly identified, and this action triggered marking of the cancer by four other observers, as well. Right hilar vascular shadow caused an FP marking of the CAD system. (a) Original radiograph. (b) BSI. (c) Radio-graph with CAD marks (circles). (d) Cross section of coronal CT scan.

Page 7: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

258 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

256), 65% (150 of 232), and 22% (32 of 144) for subtlety categories 1–4, respec-tively. With CAD, sensitivities increased

For the four subcategories of conspi-cuities, readers reached a mean sensi-tivity of 96% (247 of 256), 86% (220 of

subjective categorization of conspicuity well correlated with the degree of ob-servers’ suspicion.

Figure 4

Figure 4: A 23-mm non–small-cell lung cancer in the apex of the right lung in a 65-year-old woman. The lesion was categorized as moder-ately subtle (category 2). None of the observers marked the lesion on the basis of the standard chest radiograph and the BSI. After CAD review, four observers marked the tumor. (a) Original radiograph. (b) BSI. (c) Radiograph with CAD marks (circle). (d) Cross section of coronal CT scan.

Page 8: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Radiology: Volume 272: Number 1—July 2014 n radiology.rsna.org 259

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

to 52 annotations with CAD (P = .001). However, when considering only cases with an FP score exceeding a degree of suspicion of 50 (on a scale of 0–100), observers made, on average, 15 FP an-notations without CAD and 17 FP an-notations with CAD (P = .31).

Discussion

The results of this study show that CAD provides additional value beyond previously documented beneficial effect of BSIs alone. Several previous studies had homogeneously documented a sig-nificant improvement of the detection of lung nodules by using BSIs (5,6,14). The results of this study also demon-strate that this CAD system provides further support for the detection of lung nodules when the initial image analysis was performed with BSIs. This point is noteworthy because previous studies in which researchers evaluated CAD alone had found variable results. Some found an increase in accuracy for the detec-tion of lung nodules (7,15,16). Others found an increase in sensitivity but also a decrease in specificity, resulting in a negligible effect of CAD (8,9,17,18).

Investigators in three of the previ-ously published CAD observer studies have evaluated a previous version of the same software we used in this study; in one study, they found a positive effect of CAD (16), and in the other two, they could not prove significant improvement by using CAD (9,18). In this study, we used an upgraded version of the CAD algorithm with a lower FP rate. The sig-nificant improvement with CAD that we found now, therefore, may be attribut-able to both a further improvement of the CAD algorithm and the availability of BSIs. On the other hand, it is re-markable that CAD was able to help further increase performance from an already high baseline performance with use of BSIs. The fact that, with CAD, we found nodules that were missed by all readers (three of seven lesions) or by most of the readers (12 of 25 le-sions), even with study conditions in which readers were especially vigilant to detect nodules, underscores the po-tential of the CAD algorithm.

by readers but seen by CAD and the number of TP CAD candidates dis-missed by the observers were highest within the group of low-conspicuity le-sions (Table 1).

FP AnnotationsTaking FP annotations of any suspicion (on a scale of 0–100) into account, readers made, on average, 43 FP an-notations in cases without a nodule without CAD. This number increased

to 98% (252 of 256), 92% (235 of 256), 75% (175 of 232), and 29% (42 of 144), respectively. AFROC analysis showed that improvement for moderately subtle and subtle nodules (categories 2 and 3, respectively) reached significance, with P = .003 and P = .01, respectively.

Most of the lesions that were missed by observers and CAD were in the low-conspicuity groups (categories 3 and 4). At the same time, both the relative number of lesions that had been missed

Table 2

Observer Performance

Observer*

Without CAD With CAD

P ValueAUC Partial AUC AUC Partial AUC

Radiologists Observer A, 13 years 0.812 0.125 0.840 0.135 .05 Observer B, 3 years 0.820 0.126 0.853 0.143 .01 Observer C, 5 years 0.857 0.143 0.879 0.148 .03 Observer D, 17 years 0.748 0.110 0.792 0.119 .01 Observer F, 17 years 0.830 0.117 0.848 0.127 .21 Average 0.814 0.124 0.842 0.134 .0005Residents Observer E, 4 years 0.780 0.120 0.817 0.123 ,.05 Observer G, 2 years 0.792 0.119 0.831 0.127 .005 Observer H, 4 years 0.856 0.140 0.868 0.143 .21 Average 0.809 0.126 0.839 0.131 ,.05All 0.812 0.125 0.841 0.133 .0001

Note.—The AUC and partial AUC were at a high specificity range between 80% and 100% without and with CAD. P values of the differences without and with CAD were computed with the Dorfman-Berbaum-Metz method for the whole AUC.

* Number of years is the number of years of experience or the years of residency.

Table 3

Average Sensitivities and Specificities at Various Decision Thresholds of the Confidence Scores of Observer Markings

Decision Threshold*

Sensitivity (%) Specificity (%)

Without CAD With CAD Difference Without CAD With CAD Difference

All markings 73.1 79.3 6.2 77.1 72.4 4.7Observer markings .30 68.4 73.6 5.2 86.2 84.6 1.6 .50 61.5 67.2 5.7 92.0 91.2 0.8 .70 53.7 59.6 5.9 96.3 95.8 0.5 .80 41.8 48.4 6.6 97.6 97.2 0.4 .90 23.6 32.4 8.8 99.0 98.9 0.1

* The decision threshold is for confidence scores of observer markings. For instance, a confidence level of greater than 30 for all markings means that only observer markings exceeding a confidence score of 30 are being considered in the calculation of sensitivity and specificity. In general, at a high confidence level (.90), sensitivity was lowest but specificity highest. Availability of CAD achieved in this category the highest relative gain in sensitivity without loss of specificity.

Page 9: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

260 radiology.rsna.org n Radiology: Volume 272: Number 1—July 2014

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

apparently very difficult for human ob-servers. Availability of BSIs might have been helpful for differentiating TP from FP candidates, but it did not help to minimize or even eliminate the fact that TP candidates were not accepted by the readers. We, therefore, plan to explore other means to use CAD (eg, by adding weighting factors or likelihood factors to strengthen the influence of CAD on the reader decision).

Our study had some limitations. The study group had a much higher preva-lence of subjects with disease (subjects with a solitary nodule) than that encoun-tered in clinical practice. Furthermore, study conditions in which readers are focused on a specific detection task do not reflect the clinical situation.

The effect of BSIs on evaluating im-ages with other than nodular disease is yet unknown. It could be stated that over-sight represents a bigger issue in clin-ical practice than do study conditions, where observers are explicitly focusing on the search for nodules. With respect to failure of detection due to oversight, it is therefore likely that beneficial ef-fects of BSIs and CAD become more prominent in clinical practice. Finally, none of the readers had experience with CAD in chest radiography. Although we provided a training set of 40 cases with instant feedback, this might have been insufficient to help the readers become familiar with BSIs and CAD.

We demonstrated that CAD has an additional beneficial effect on the detec-tion of pulmonary nodules beyond the effect of BSIs alone. Even though base-line performance was optimized by the availability of BSIs, radiologists were able to uniformly improve their detec-tion performance.

Acknowledgments: The authors acknowl-edge L. Meiss, MD (Meander Medical Center, Amersfoort, the Netherlands), M. Imhof-Tas, MD, and L. Peters-Bax, MD, PhD (both from Radboud University Medical Center, Nijmegen, the Netherlands), and L. G. B. A. Quekel, MD, PhD (Meander Medical Center, Amersfoort, the Netherlands), for their participation in the ob-server study.

Disclosures of Conflicts of Interest: S.S. Fi-nancial activities related to the present article: participated in a research project funded with a research grant from Riverain Technologies. Fi-nancial activities not related to the present arti-

Table 4

Number of Occasions, Pooled of All Observers, in Which Observers Defined New Markings, Removed Markings, and Decreased or Increased the Score of Suspiciousness

DataNo. of Occasions in Cases without a Nodule

No. of Occasions in Cases with a Nodule

New markings 92 57Removed markings 20 3Increase in score 66 220Decrease in score 125 29

Note.—Occasions are listed separately for cases with normal findings (n = 8 3 189 cases = 1512) and cases with a nodule (8 3 111 cases = 888).

Our results exceed the findings re-cently reported in a study in which the effect of BSIs and that of dual energy, both secondarily aided by CAD, were compared (19). In that study, the au-thors found a further increase of sen-sitivity when they added CAD as an additional tool; however, no significant increase of figure of merit was seen, indicating that the increased sensitiv-ity was nullified by a loss of specificity. Other methodological differences be-tween that study and our study refer to reading methods, statistical evaluation, and version of CAD software.

The improvement of reader perfor-mance with CAD that we found was sig-nificant for the full range of specificities (0%–100%) but also yielded uniform improvement while preserving a high specificity, as indicated by the pAUC at specificities between 80% and 100%. It has to be noted that it is ruled out by our statistical analysis that readers could improve their performance by being able to mark several locations, although they knew that the radiographs contained only one nodule: A high number of cases with normal findings with FP marks in-evitably would have led to a performance decrease by using AFROC analysis. Also, AFROC analysis correlated lesion location with locations of reader marks. We offered the option to mark several locations to encourage the readers to scrutinize the image for potential lesions completely and minimize the effects of “satisfaction of search.”

With respect to lesion conspicuity, we found the largest effect of CAD for

moderately subtle and subtle lesions. Well-visible and very subtle nodules were relatively less affected by the avail-ability of CAD. These findings can be explained by the fact that the sensitivity of the CAD system for the moderately subtle and the subtle nodules was quite high (ie, 88% and 62%, respectively), providing sufficient chances for further reader improvement. For well-visible nodules, reader performance without CAD and CAD stand-alone perfor-mance were both high. For the category of very subtle lesions, the sensitivity of CAD was much better than that of the readers (39% vs 22%). Therefore, readers could have taken advantage of the availability of CAD similarly, as they did for the moderately subtle and sub-tle lesions. The fact that readers were not able to use CAD more beneficially appears to confirm the previously dis-cussed issue that readers find it difficult to differentiate TP from FP lesions, es-pecially for very subtle lesions.

In total, readers dismissed a sub-stantial number of TP CAD candidates (n = 70) (Table 1), accounting for 55% of the TP CAD candidates (n = 127) made on lesions missed at baseline (n = 239). This percentage is similar to previously published figures ranging from 55% to 80% (9,16,20). These high percentages of dismissed TP CAD candidates show potential of further improvement with use of CAD. Al-though the majority of FP CAD marks may be easily dismissed, discrimination between a TP CAD mark for a low-con-spicuity lesion and an FP CAD mark is

Page 10: Computer-aided Detection Improves Detection of Pulmonary Nodules in Chest Radiographs beyond the Support by Bone-suppressed Images

Radiology: Volume 272: Number 1—July 2014 n radiology.rsna.org 261

THORACIC IMAGING: Computer-aided Detection Improves Pulmonary Nodule Detection Schalekamp et al

cle: none to disclose. Other relationships: none to disclose. B.v.G. Financial activities related to the present article: none to disclose. Financial activities not related to the present article: in-stitution received grants or has grants pending from Science Foundation Netherlands. Other relationships: none to disclose. E.K. No relevant conflicts of interest to disclose. M.M.S. No rele-vant conflicts of interest to disclose. A.M.T. No relevant conflicts of interest to disclose. R.W. No relevant conflicts of interest to disclose. N.K. Fi-nancial activities related to the present article: participated in a research project funded with a research grant from Riverain Technologies. Fi-nancial activities not related to the present arti-cle: none to disclose. Other relationships: none to disclose. C.M.S. Financial activities related to the present article: served on the advisory board (with payment in 2012 but no payment in 2013) of Riverain Technologies. Financial activities not related to the present article: none to disclose. Other relationships: none to disclose.

References 1. Austin JH, Romney BM, Goldsmith LS.

Missed bronchogenic carcinoma: radio-graphic findings in 27 patients with a poten-tially resectable lesion evident in retrospect. Radiology 1992;182(1):115–122.

2. Quekel LG, Kessels AG, Goei R, van En-gelshoven JM. Miss rate of lung cancer on the chest radiograph in clinical practice. Chest 1999;115(3):720–724.

3. Monnier-Cholley L, Arrivé L, Porcel A, et al. Characteristics of missed lung cancer on chest radiographs: a French experience. Eur Radiol 2001;11(4):597–605.

4. Shah PK, Austin JHM, White CS, et al. Missed non-small cell lung cancer: radiographic find-ings of potentially resectable lesions evident only in retrospect. Radiology 2003;226(1): 235–241.

5. Freedman MT, Lo SCB, Seibel JC, Bromley CM. Lung nodules: improved detection with

software that suppresses the rib and clavicle on chest radiographs. Radiology 2011;260(1): 265–273.

6. Li F, Engelmann R, Pesce LL, Doi K, Metz CE, Macmahon H. Small lung cancers: im-proved detection by use of bone suppression imaging—comparison with dual-energy sub-traction chest radiography. Radiology 2011; 261(3):937–949.

7. Xu Y, Ma D, He W. Assessing the use of digital radiography and a real-time interac-tive pulmonary nodule analysis system for large population lung cancer screening. Eur J Radiol 2012;81(4):e451–e456.

8. De Boo DW, Uffmann M, Weber M, et al. Computer-aided detection of small pulmonary nodules in chest radiographs: an observer study. Acad Radiol 2011;18(12):1507–1514.

9. de Hoop B, De Boo DW, Gietema HA, et al. Computer-aided detection of lung cancer on chest radiographs: effect on observer per-formance. Radiology 2010;257(2):532–540.

10. Kuhnigk JM, Dicken V, Bornemann L, et al. Morphological segmentation and par-tial volume analysis for volumetry of solid pulmonary lesions in thoracic CT scans. IEEE Trans Med Imaging 2006;25(4):417–434.

11. Dorfman DD, Berbaum KS, Metz CE. Receiver operating characteristic rating analysis. Generalization to the population of readers and patients with the jackknife method. Invest Radiol 1992;27(9):723–731.

12. Roe CA, Metz CE. Variance-component modeling in the analysis of receiver oper-ating characteristic index estimates. Acad Radiol 1997;4(8):587–600.

13. Austin JH, Müller NL, Friedman PJ, et al. Glossary of terms for CT of the lungs: recom-mendations of the Nomenclature Committee of the Fleischner Society. Radiology 1996; 200(2):327–331.

14. Li F, Hara T, Shiraishi J, Engelmann R, MacMahon H, Doi K. Improved detection of subtle lung nodules by use of chest ra-diographs with bone suppression imaging: receiver operating characteristic analysis with and without localization. AJR Am J Roentgenol 2011;196(5):W535–W541.

15. van Beek EJR, Mullan B, Thompson B. Eval-uation of a real-time interactive pulmonary nodule analysis system on chest digital ra-diographic images: a prospective study. Acad Radiol 2008;15(5):571–575.

16. Kligerman S, Cai L, White CS. The effect of computer-aided detection on radiologist performance in the detection of lung cancers previously missed on a chest radiograph. J Thorac Imaging 2013;28(4):244–252.

17. Bley TA, Baumann T, Saueressig U, et al. Comparison of radiologist and CAD perfor-mance in the detection of CT-confirmed sub-tle pulmonary nodules on digital chest radio-graphs. Invest Radiol 2008;43(6):343–348.

18. Meziane M, Obuchowski NA, Lababede O, Lieber ML, Philips M, Mazzone P. A compar-ison of follow-up recommendations by chest radiologists, general radiologists, and pulmo-nologists using computer-aided detection to assess radiographs for actionable pulmonary nodules. AJR Am J Roentgenol 2011;196(5): W542–W549.

19. Szucs-Farkas Z, Schick A, Cullmann JL, et al. Comparison of dual-energy sub-traction and electronic bone suppression combined with computer-aided detection on chest radiographs: effect on human ob-servers’ performance in nodule detection. AJR Am J Roentgenol 2013;200(5):1006– 1013.

20. Lee KH, Goo JM, Park CM, Lee HJ, Jin KN. Computer-aided detection of malignant lung nodules on chest radiographs: effect on ob-servers’ performance. Korean J Radiol 2012; 13(5):564–571.