Computerized detection of pulmonary nodules in chest radiographs based on morphological features and wavelet snake model

Medical Image Analysis 6 (2002) 431–447
www.elsevier.com/locate/media

Bilgin Keserci a,1, Hiroyuki Yoshida b,*

a Osaka University Medical School, Division of Functional Diagnostic Imaging, 2-2 Yamadaoka, Suita City, Osaka 565-0871, Japan

b Department of Radiology, University of Chicago, 5841 South Maryland Avenue, MC2026, Chicago, IL 60637, USA

Received 20 December 1999; received in revised form 30 July 2001; accepted 14 February 2002

Abstract

We have developed a new computer-aided diagnosis scheme for automated detection of lung nodules in digital chest radiographs based on a combination of morphological features and the wavelet snake. In our scheme, two processes were applied in parallel to reduce the false-positive detections after initial nodule candidates were selected. One process consisted of adaptive filtering for enhancement of nodules and suppression of normal lung structures, followed by extraction of conventional morphological features. The other process consisted of a novel approach for elimination of false positives called the edge-guided wavelet snake model. In the latter process, multiscale edges of the candidate nodules were extracted to yield parts of the nodule boundaries. A wavelet snake was then used for fitting of these multiscale edges for approximation of the true boundaries of nodules. A boundary feature called the weighted overlap between the snake and the multiscale edges was calculated and used for elimination of false positives. Finally, the weighted overlap and the morphological features were combined by use of an artificial neural network for efficient reduction of false positives. Our scheme was applied to a publicly available database of digital chest images for pulmonary nodules. Receiver operating characteristic analysis was employed for evaluation of the performance of each process in the scheme. The combined features yielded a large reduction of false positives, and thus achieved a high performance in discriminating between true and false positives. These results show that our new method, in particular the false-positive reduction method based on the wavelet snake, is effective in improving the performance of a computerized scheme for detection of pulmonary nodules in chest radiographs. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Pulmonary nodule; Computer-aided diagnosis; Wavelet transform; Snake; Artificial neural network

*Corresponding author. E-mail address: [email protected] (H. Yoshida).

1 Present address: General Electric Yokogawa Medical Systems, Institute of Biomedical Research and Innovation, Department of Image-Based Medicine, 2-2 Minatojima-Minamimachi, Chuo, Kobe 650-0047, Japan.

1361-8415/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S1361-8415(02)00064-6

1. Introduction

Lung cancer is the leading cause of cancer deaths among men and women in the United States. An estimated 171 600 new cases were discovered in the United States in 1999, which accounted for 14% of cancer diagnoses, and an estimated 158 900 deaths were caused, accounting for 28% of all cancer deaths (ACS, 1999). The overall 5-year survival rate for lung cancer patients is only 14% (ACS, 1999).

Early detection and treatment of lung cancers are important because the survival rate can be increased to 50% if the tumor is detected at an early stage. In particular, the accurate detection and diagnosis of solitary, circumscribed lung nodules is of great importance because many of these lesions are American Joint Committee stage I lung cancers. Although computed tomography is generally considered the most effective imaging modality for detection of pulmonary nodules, chest radiography remains the initial procedure for the detection task. Radiologists make their diagnosis from radiographs based on the perception of abnormalities and the subsequent decision-making. However, studies show that radiologists detect


432 B. Keserci, H. Yoshida / Medical Image Analysis 6 (2002) 431–447

Fig. 1. Major steps in our computer-aided diagnosis scheme for the automated detection of lung nodules in chest images.
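The flow summarized in Fig. 1 can be read as a small pipeline. The following is a minimal Python sketch based only on the description in the abstract (candidate selection, two parallel feature processes, and a classifier that combines the features). All function and parameter names here (`detect_nodules`, `select_candidates`, `morphological_features`, `weighted_overlap`, `is_nodule`) are hypothetical placeholders and not the authors' implementation.

```python
from typing import Any, Callable, List, Sequence, Tuple

Candidate = Tuple[int, int]  # hypothetical: (x, y) location of a nodule candidate


def detect_nodules(
    image: Any,
    select_candidates: Callable[[Any], Sequence[Candidate]],
    morphological_features: Callable[[Any, Candidate], Sequence[float]],
    weighted_overlap: Callable[[Any, Candidate], float],
    is_nodule: Callable[[Sequence[float]], bool],
) -> List[Candidate]:
    """Run candidate selection, both feature processes, and the final classifier."""
    detections = []
    for cand in select_candidates(image):
        # The two feature-extraction processes work independently on the same candidate.
        features = list(morphological_features(image, cand))
        features.append(weighted_overlap(image, cand))
        # A trained classifier (an ANN in the paper) combines the features.
        if is_nodule(features):
            detections.append(cand)
    return detections
```

Passing the stages in as callables keeps the sketch independent of any particular filter or classifier, which is the only design choice made here.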

pulmonary nodules in radiographs in about 70–80% of actually positive cases (Forrest and Friedman, 1981; Naidich et al., 1984). Muhm et al. (1983) reported that 90% of missed peripheral lung cancers were detected retrospectively on previous films. The reasons for misdiagnosis may be differences in decision techniques, lack of experience, lack of clinical data, and structured film noise (Revesz et al., 1974; Fontana and Sanderson, 1975; Kundel and Revesz, 1976; Kundel and Nodine, 1980).

A computerized approach is attractive because it has the potential to provide objective and consistent results. Radiologists’ accuracy in detecting pulmonary nodules can be improved by providing them with the ‘second opinion’ offered by a computer-aided diagnosis (CAD) scheme, the main function of which is to analyze chest radiographs and indicate the potential locations of nodules in the images. The feasibility of CAD was demonstrated by Kobayashi et al. (1996) and MacMahon et al. (1999), who showed that a CAD scheme could significantly improve radiologists’ diagnostic accuracy in the detection of pulmonary nodules in chest radiographs.

Based on this expectation, various investigators have reported computerized schemes for detection of pulmonary nodules in digital chest radiographs (see Section 1.1). However, none of these schemes has been applied successfully to clinical trials, because most of these methods suffer from a large number of false positives. In order to achieve a clinically acceptable high performance, therefore, we developed a new CAD scheme for detection of pulmonary nodules that focuses on the reduction of false-positive detections.

The new CAD scheme is based on a combination of (1) extraction of morphological features from the candidate nodules, which is a conventional approach for discriminating between nodules and false positives (Xu et al., 1997), and (2) a feature obtained by a novel approach called an edge-guided wavelet snake model (Yoshida et al., 1997; Keserci, 1999). Fig. 1 is a diagram of the CAD scheme. Features extracted from the above two parallel processes are combined by an artificial neural network (ANN) for efficient reduction of false positives, and thus for yielding a high performance in the detection of pulmonary nodules. The performance of each feature as well as combined features was evaluated by receiver operating characteristic (ROC) analysis.

The wavelet snake is a deformable contour that uses the wavelet transform to deform its shape. The wavelet snake model is unique in the sense that it employs a priori knowledge about the difference between the nodules and normal structures such as ribs when it extracts a feature. Nodules in chest radiographs look like round objects that are often overlapped with elongated normal structures such as vessels and ribs (Fig. 2). Therefore, the wavelet snake models the nodule as a closed-contour object with a smooth boundary, whereas it assumes that the ribs are open-contour, band-like objects. Combined with multiscale edges that extract portions of the boundary of a nodule, this model allows the wavelet snake to distinguish nodules from false positives effectively. The multiscale nature of the wavelet basis is suitable for capturing the difference between nodules and other structures, which is difficult to perform by use of the morphological features alone.

1.1. Related work

Various computerized schemes have been proposed for detection of pulmonary nodules in digital chest radiographs. In their pioneering work, Toriwaki et al. (1973) used an edge detection approach with a linear filter and thresholding to locate suspicious regions in the chest, which consisted of connected pixels above a predetermined threshold. Then they used a series of tests for size, location, and density variation to examine those regions. Lampeter and Wandtke (1986) manually masked the outside of the lung regions in chest radiographs, followed by spline filtering for enhancement of the nodules. The


nodules were then detected by use of a Hough transform for circles.

Fig. 2. Example of a chest radiograph in which nodule location is indicated by an arrow.

The early work on computer-aided diagnosis schemes for detection of nodules depended mainly on local edge detection and contrast enhancement. No attempt was made to suppress normal background structures. Giger et al. (1988, 1990) developed a difference-image technique to reduce the complex anatomic background structures while enhancing the nodule-like patterns. This was accomplished, from a single digital chest image, by producing two filtered images: one in which the signal of the nodules was enhanced, and the other in which the signal of a nodule was suppressed. The difference between these two processed images yielded an image with the signal of the nodule superimposed on a simplified background. The resulting difference image was processed by gray-level thresholding based on a histogram of the difference image for obtaining nodule candidates. For each nodule candidate, a region-growing technique was applied for extracting various morphological features on nodules such as circularity, irregularity, and effective diameter. These features were then subjected to rule-based tests for detection of nodules. Using the difference-image and feature extraction techniques, they achieved a sensitivity of 70% with an average of 7 or 8 false-positive detections per chest image.

Recent computerized schemes employ a three-phase approach, i.e. initial nodule candidate selection, extraction of features from these candidates, and discrimination of nodules from false positives based on these features, as formulated by Giger et al. (1988, 1990). To improve the performance of the initial nodule candidate selection process, Yoshida et al. (1995) used the wavelet transform for detection of subtle nodules that were missed when the difference-image technique was used. In that study, they employed the partial wavelet reconstruction method as a replacement for the difference-image technique in the first step of the above CAD scheme for enhancement of pulmonary nodules while suppressing normal anatomic structures. In this approach, digitized chest images were first decomposed by the wavelet transform and then reconstructed from several different scale components. The same image feature analysis steps subsequent to the difference-image technique were applied to the reconstructed images for detection of pulmonary nodules. The computer output from this scheme was different from that of the difference-image-based scheme, because the scale in the partial reconstruction method was selected to enhance some subtle nodules that could not be enhanced by the difference-image technique. By combining the wavelet transform and the difference-image technique, they increased the sensitivity to 86% while maintaining the same level of specificity.

A comprehensive computer-aided diagnosis scheme based on the above three-phase approach was developed by Xu et al. (1997). The difference-image technique was used in the first step for the enhancement of nodule-like patterns while reducing the complex anatomic background structures. After the difference image was obtained, multiple gray-level thresholding based on a histogram of the difference image was performed for identifying initial nodule candidates. The nodule candidates were then classified into six groups according to the threshold levels (in terms of the upper percentage area under the histogram of the difference image) at which the nodule candidates were identified. For separation of nodules from false positives, various image features, most of which are morphological features, were extracted from both the difference image and the original image with use of a region-growing technique and edge-gradient analysis. The image features derived from the region-growing technique included contrast, effective diameter, degree of circularity, degree of irregularity, and the change rates of the effective diameter, circularity, and irregularity as the gray-level thresholding was varied. Edge-gradient analysis was applied to a region of interest centered at a candidate location of the edge-gradient orientation. Finally, a rule-based analysis was applied to candidates in each group, with cutoff values (rules) determined manually for individual image features in each group. Because the rules varied according to the group to which a nodule candidate belonged, these rule-based tests were called adaptive rule-based tests. The computer-aided diagnosis scheme achieved a sensitivity of 70% and a false-positive rate of approximately two per image.

Unfortunately, schemes proposed in the past suffered from a large number of false positives. In an effort to


reduce the number of false positives, various approaches have been investigated. Matsumoto et al. (1992) applied linear filtering techniques to produce signal-enhanced and signal-suppressed images. They then attempted to reduce the prominence of normal anatomic structures from these images. To reduce the false positives, they used various features such as the size, contrast, and shape of the nodule candidates, extracted from the difference and the original images. In their study, they achieved a sensitivity of approximately 72% and a false-positive rate of five per image. Thus, the number of false positives remained relatively high.

Lo et al. (1993) investigated the use of neural networks for reduction of false positives in combination with a sphere profile matching technique. In their study, they selected 30 patients who had primary or metastatic cancer, with the sizes of nodules ranging from 3 to 15 mm, and additional patients with no tumors. They first enhanced chest images by subtracting a nodule-suppressed image from a nodule-enhanced image. The enhanced image was processed by a feature-extraction technique based on edge detection and gray-level thresholding, followed by a sphere profile matching technique. Then a neural network was used to distinguish the rib crossings and round vessels from the true nodules. The performance of the neural network was evaluated by ROC analysis, which yielded an average area under the ROC curve ($A_z$) of 0.63–0.78 in the distinction task. Unfortunately, no sensitivity or false-positive rate was reported. Lo et al. (1995) further extended their neural network approach by employing a convolution neural network, which has a simplified network structure modeled on human vision. Based on 25 abnormal chest radiographs containing nodules and 30 normal chest radiographs, their neural network yielded an $A_z$ value of 0.83 in the distinction task. No sensitivity or false-positive rate was reported.

An attempt to evaluate the efficacy of a new feature was made by Vittitoe et al. (1997), who used a fractal texture feature to distinguish nodules from false positives. They added 30 simulated nodules to 30 normal chest radiographs. Also, 30 regions were selected that, although normal, were considered suspicious-looking for nodules. Through fractal texture analysis, they concluded that the fractal dimensions of simulated nodules yielded a statistically significant difference from that of the suspicious-looking regions.

Carreira et al. (1998) first used a knowledge-based system to extract lung masks over which they applied the nodule detection process. Second, they obtained the normalized cross-correlation image to increase nodule conspicuity, followed by detection of suspicious regions by assuming a threshold. Then they analyzed the size and circularity of suspected nodules as they grow from the cross-correlated image. Penedo et al. (1998) advanced the above approach by incorporating a two-level ANN. The first ANN detected suspicious regions in a low-resolution image in a manner similar to the difference-image technique. In the second phase, they used another ANN to reduce false positives. The input to the second ANN consisted of the curvature peaks computed from each suspicious region, where curvature is the local curvature of the image data when viewed as a relief map. Their scheme was evaluated based on 60 clinical radiographs, with 90 real nodules and 288 simulated nodules. The combination of the two networks provided results of 89–96% sensitivity with a false-positive rate of five to seven per image.

Despite all of these efforts, none of these schemes has been applied successfully to clinical trials, because most of these methods still suffer from a large number of false positives; this is the main obstacle in existing computerized detection schemes. When these schemes are set to yield a clinically acceptable sensitivity of 85–90%, the number of false positives increases in the range of five to more than ten per image. Matsumoto et al. (1993) conducted observer studies which showed that, if a CAD scheme had a high false-positive rate of 11 per image, radiologists’ accuracy in detecting pulmonary nodules was not improved when they were aided by computer output, even though the scheme had a moderately high sensitivity of 80%. However, radiologists’ accuracy was significantly improved if the CAD had a simulated low false-positive rate of one per image with the same sensitivity. Therefore, having a low false-positive rate is critical for a CAD scheme to be useful.

Unlike these approaches, our main method in this study, the wavelet snake, employs a priori knowledge about the difference between nodules and normal structures such as ribs when it extracts a feature. For this purpose, the wavelet snake models the nodule as a closed-contour object with a smooth boundary, whereas it assumes that the ribs are open-contour, band-like objects. Combined with multiscale edges that extract portions of the boundary of a nodule, this model allows the wavelet snake to distinguish nodules from false positives effectively, as demonstrated in the following sections.

The rest of this paper is organized as follows. Section 2 presents the materials and methods for this study, including the database of pulmonary nodules, a process for selecting the initial candidates, a process for extraction of morphological features of nodule candidates, a feature extraction process by the wavelet snake, and a false-positive reduction method based on a combination of features by means of ANN. Section 3 describes the experimental results, followed by a discussion and a conclusion in Sections 4 and 5.

2. Materials and methods

2.1. Acquisition of digital images

In this study, a publicly available database of digitized chest radiographs was used (JSRT, 1997). The database was created by digitization of the clinical chest radiographs


by Konica film digitizers LD 4500 and LD 5500. The original 14″ × 17″ chest radiographs were digitized to 12 bits with a matrix size of 2048 × 2048 and a pixel size of 0.175 mm. A total of 50 chest images, each of which had a lung nodule, were employed as a database for our CAD scheme (patients’ age range: 39–80 years; mean age: 62.14 years; 30 female, 20 male patients). The mean diameter of the nodules in these images was approximately 20 mm, with a standard deviation of 7 mm. The minimum and maximum sizes of these nodules were 7 and 37 mm, respectively. The locations of the actual nodules were provided by the x and y coordinates. All images collected in the database had been examined in CT images and by cytologic and histologic diagnosis for confirmation of the presence or absence of a nodule. These images were obtained with screen-film systems over a period of 3 years.

We developed our CAD scheme based on this database. In the first step of the scheme, selection of initial candidates, we used images that were subsampled to a size of 512 × 512, yielding an effective pixel size of 0.7 mm, because this resolution was found to be adequate for detecting the initial nodule candidates (Xu et al., 1997). In the subsequent steps in our CAD scheme described in Sections 2.3 and 2.4, we used the original images without changing the resolution.

2.2. Selection of initial nodule candidates

In the first step of our scheme, the locations of the nodule candidates were identified from the chest images in a manner described in the following. More details are presented elsewhere (Keserci, 1999; Xu et al., 1997).

We first identified the cardiac region and ribcage edges in the chest images based on the analysis of the first and second derivatives of gray-level profiles through chest images (Xu and Doi, 1995). These ribcage edges were used for extracting the lung field, because we only targeted the nodules in the aerated lung region. Next, we identified the midline of the thorax by determining a straight line that divides the thorax into two approximately symmetric and equal parts based on the ribcage edges. In this method, first the ribcage edges on both sides of the thorax were detected by analysis of the first and second derivatives of gray-level profiles through the chest image. Then the midpoints of the corresponding locations of the left and right ribcage edges were determined. Most of these midpoints lay on a straight line because of the nearly symmetric property of the ribcage edges on both sides of the lungs. These midpoints were then fitted by a straight line for derivation of the midline. To extract only the aerated areas in the lungs, we eliminated the mediastinal region, which is an area in the center of the lung field, by removal of the pixels within 14 mm from the midline. This width was determined by the average size of the spine shadow in the chest images in our database.

To enhance the nodules, we generated the difference image by subtraction of two filtered images, the signal-enhanced images (by a nodule-shape matched filter) and signal-suppressed images (by a smooth ring filter), from the original image (Giger et al., 1988, 1990). The backgrounds of the two filtered images are similar; thus, the difference image has a relatively simple background yielding enhanced nodules. We then applied multiple gray-level thresholding to the difference images to obtain nodule candidates. The threshold values were determined based on the gray-level histogram of the difference image. The pixel values of nodules in the difference image are located at the high end of the histogram (Giger et al., 1990; Xu et al., 1997). Therefore, we used threshold values that provide the percentage area under the histogram from the high end, called % threshold levels. It should be noted that a large % threshold level corresponds to a low cutoff pixel value. At each % threshold level, the difference images were thresholded and binarized, followed by eight-connectivity region growing, for extraction of the islands that represent nodule candidates.

We empirically selected 30% as the ending % threshold level, because we found that the corresponding threshold value was low enough for detection of all nodules. Starting with the 1% threshold level, we applied the above thresholding and binarization of the difference image at the 1% increment of the histogram to obtain islands. For each island at each level, two simple morphological features, area and circularity, were calculated (see Appendix A). Then those islands with areas larger than 33 mm² and circularity larger than 0.65 were identified as the initial nodule candidates. The minimum area was set so that one could select the islands with diameter greater than 6.5 mm, which is smaller than the minimum size of the nodules in our database (see Section 2.1). The minimum circularity was set to be equal to four fifths of the minimum circularity of the islands corresponding to the nodules in our database. It should be noted that the same candidate might appear at multiple levels. However, once a candidate was selected at a % threshold level, it would not be examined again at the subsequent % threshold levels; thus, we avoided duplicate detections.

Among these candidates, true positives were defined to be the set of candidates for which the centers of islands were within 10 mm from the true nodule locations, because the mean diameter of the nodules in our database was approximately 20 mm. The resulting candidates were used as the initial nodule candidates, which consisted of approximately 90% true nodules and 15 false positives per image. It should be noted that only the locations of the nodules detected at this step were subjected to the next steps, consisting of iris filtering and the wavelet snake.

2.3. Extraction of morphological features

2.3.1. Enhancement of nodules by iris filtering

In the second part of our scheme, an attempt was made to enhance the nodules effectively while suppressing the normal lung structures such as ribs and vessels in chest images.
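The candidate-selection rule of Section 2.2 (% threshold levels from 1% to 30%, eight-connectivity islands, then area and circularity cutoffs) can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' code: the circularity definition lives in Appendix A and is not reproduced here, so the sketch assumes the fraction of the island lying inside the equal-area circle at its centroid; it also assumes that an island containing the center of an already selected candidate counts as a duplicate. The names `islands_8conn`, `circularity`, and `select_candidates` are hypothetical.

```python
from collections import deque

import numpy as np

PIXEL_MM = 0.7          # effective pixel size after subsampling to 512 x 512
MIN_AREA_MM2 = 33.0     # area cutoff quoted in the text
MIN_CIRCULARITY = 0.65  # circularity cutoff quoted in the text


def islands_8conn(mask):
    """Return each 8-connected island of a boolean mask as an (n, 2) array of (row, col)."""
    seen = np.zeros(mask.shape, dtype=bool)
    rows, cols = mask.shape
    islands = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue, pixels = deque([(int(r0), int(c0))]), []
        while queue:
            r, c = queue.popleft()
            pixels.append((r, c))
            for rr in range(max(r - 1, 0), min(r + 2, rows)):
                for cc in range(max(c - 1, 0), min(c + 2, cols)):
                    if mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
        islands.append(np.array(pixels))
    return islands


def circularity(pixels):
    """ASSUMED definition: fraction of the island inside the equal-area circle at its centroid."""
    centroid = pixels.mean(axis=0)
    radius = np.sqrt(len(pixels) / np.pi)
    dist = np.linalg.norm(pixels - centroid, axis=1)
    return np.count_nonzero(dist <= radius) / len(pixels)


def select_candidates(diff_img, last_percent=30):
    """Scan % threshold levels 1..last_percent; return (level, row, col) per new candidate."""
    found = []
    for level in range(1, last_percent + 1):
        # A larger % level keeps more of the histogram, i.e. a lower cutoff value.
        cutoff = np.percentile(diff_img, 100 - level)
        for pixels in islands_8conn(diff_img >= cutoff):
            # ASSUMPTION: an island containing an earlier candidate's center is a duplicate.
            members = {(int(r), int(c)) for r, c in pixels}
            if any((r, c) in members for _, r, c in found):
                continue
            area_mm2 = len(pixels) * PIXEL_MM ** 2
            if area_mm2 > MIN_AREA_MM2 and circularity(pixels) > MIN_CIRCULARITY:
                r, c = np.rint(pixels.mean(axis=0)).astype(int)
                found.append((level, int(r), int(c)))
    return found
```

As a check on the quoted cutoff, a circle of diameter 6.5 mm has area π(6.5/2)² ≈ 33.2 mm², which agrees with the 33 mm² minimum area.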


For this purpose, an iris filtering method (Kobatake and Yoshinaga, 1996) was employed, in which an adaptive filter measures the degree of concentration of the gradient vector in the vicinity of an operating point. The iris filtering is effective in enhancing round objects in an image and suppressing elongated objects, regardless of their contrast against their backgrounds. Iris filtering is an adaptive filtering operation in the sense that it automatically adjusts the support of the filter to measure the maximum degree of concentration of the gradient vectors at the operating point.

The output of the iris filtering $C(x, y)$ at an operating point $(x, y) \in R^2$ is defined by

$$C(x, y) = \frac{1}{N} \sum_{i=1}^{N} c_i^{\max}, \qquad c_i^{\max} = \max_{R_{\min} \le n \le R_{\max}} \left\{ \frac{1}{n - R_{\min} + 1} \sum_{l=R_{\min}}^{n} \cos\theta_{il} \right\}, \qquad (1)$$

where $(x, y)$ is the position of the pixel of interest and $N$ represents the number of radial lines leading out from the operating point, as illustrated in Fig. 3. As shown in this figure, $\theta_{il}$ represents the angle between the $i$th radial direction and the gradient vector $g_l$, generated by a 3 × 3 spatial difference filter, at the point that is $l$ pixels away from the operating point, defined by $(x + l\cos[(2\pi/N)(i - 1)],\ y + l\sin[(2\pi/N)(i - 1)])$. Therefore, the term $\cos\theta_{il}$ is a measure of the concentration of the gradient vector to the operating point. The maximum degree of concentration of the gradients at each radial direction $i$ is represented by $c_i^{\max}$, defined by Eq. (1). As a result, the output of the iris filtering, $C(x, y)$, provides an average concentration of the gradient vectors to $(x, y)$ from all directions, and it takes a value between 0 and 1 because the lower and upper bounds of $c_i^{\max}$ are 0 and 1, respectively. When the number of gradient vectors pointing to the operating point $(x, y)$ increases, $C(x, y)$ increases as well.

The gradient vectors in the vicinity of the center of a nodule tend to concentrate at the center because a nodule shadow appears to be a Gaussian-like blob on a radiograph. On the other hand, the gradient vectors tend to be parallel or even random on ribs and vessels, which yields a small value for the average concentration. Therefore, the value of $C$ is expected to be close to a maximum of 1 at the center of a nodule, whereas the value is expected to be close to a minimum of 0 on a rib and a vessel. As a result, pixels corresponding to a nodule tend to have large values when the output of iris filtering, $C$, is displayed as an image. Nodules are enhanced in this image, as demonstrated in Fig. 4, which shows the output of the iris filtering of the original image shown in Fig. 2.

2.3.2. Extraction of morphological features

After the iris filtering was applied to the original images, we extracted ROIs in which initial nodule candidates, identified as described in Section 2.2, were located at the center. Then multiple gray-level thresholding, binarization, and eight-connectivity region growing were applied, in the same manner as described in Section 2.2, to these ROIs for extraction of islands corresponding to the nodule candidates. For each of these islands, we calculated the following 10 morphological features characterizing nodules: area,

Fig. 3. Illustration of a method for evaluation of the edge concentration by iris filtering.

Fig. 4. Result obtained by applying the iris filter to the chest image shown in Fig. 2. The conspicuity of the nodule indicated by an arrow in Fig. 2 is increased.
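The iris-filter output of Eq. (1), illustrated in Fig. 3, can be sketched at a single operating point as follows. This is a minimal NumPy sketch under stated assumptions rather than the authors' implementation: central differences from `numpy.gradient` stand in for the 3 × 3 spatial difference filter, sample points on each radial line are rounded to the nearest pixel, the object is assumed brighter than its background so that the gradient is compared with the inward direction, and the defaults for `n_dirs`, `r_min`, and `r_max` are illustrative values that do not come from the paper. The function name `iris_filter_at` is hypothetical.

```python
import numpy as np


def iris_filter_at(img, x, y, n_dirs=16, r_min=2, r_max=10):
    """Iris-filter output C(x, y) of Eq. (1) at one operating point (column x, row y)."""
    # ASSUMPTION: central differences stand in for the 3 x 3 spatial difference filter.
    gy, gx = np.gradient(img.astype(float))
    rows, cols = img.shape
    total = 0.0
    for i in range(n_dirs):
        angle = 2.0 * np.pi * i / n_dirs
        ux, uy = np.cos(angle), np.sin(angle)
        cos_sum, best = 0.0, None
        for l in range(r_min, r_max + 1):
            px, py = int(round(x + l * ux)), int(round(y + l * uy))
            if not (0 <= px < cols and 0 <= py < rows):
                break
            norm = np.hypot(gx[py, px], gy[py, px])
            if norm > 0:
                # ASSUMPTION: bright objects, so the gradient is compared with the
                # direction pointing from the sample back to the operating point.
                cos_sum += -(gx[py, px] * ux + gy[py, px] * uy) / norm
            # Running mean over l = r_min..n; its maximum over n is c_i^max.
            mean_cos = cos_sum / (l - r_min + 1)
            best = mean_cos if best is None else max(best, mean_cos)
        total += best if best is not None else 0.0
    return total / n_dirs
```

Evaluated at the center of a synthetic Gaussian blob, this sketch returns a value close to 1, while at a point on a straight ridge it returns a noticeably lower value, which is the qualitative behavior the text describes for nodules versus elongated structures.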


perimeter, width, height, the ratio of width to height, rectangularity, circularity, mean curvature of the boundary, and two moments ($m_{02}$, $m_{20}$). For completeness, definitions of these morphological features are summarized in Appendix A. We selected a subset of these morphological features that yielded the highest performance in discrimination of true nodules from false positives. The selection method and the resulting performance are described in Section 3.3.

2.4. Feature extraction based on wavelet snake model

In the third part of our scheme, in parallel to the second part, a feature called a weighted overlap of a nodule was extracted by use of an edge-guided wavelet snake model. This feature extraction process consisted of the following three steps: (1) extraction of partial nodule boundaries by a multiscale edge detector, (2) estimation of nodule boundaries by the wavelet snake model, and (3) calculation of the weighted overlap between the snake and the multiscale edges. These processes are described in sequence in the following subsections.

2.4.1. Extraction of nodule boundaries by multiscale edge detector

First, an attempt was made to extract a portion of the boundary of a nodule in digital chest images. For this purpose, we employed a multiscale edge detection (Mallat and Hwang, 1992; Mallat and Zhong, 1992) method based on local maxima of multiscale gradients. The main advantage of the multiscale edges is that features such as the length and intensity of the edges can characterize various structures in chest radiographs. For example, multiscale edges with very short lengths and low intensities may be caused by background noise. Larger edges at medium scales with higher intensities may represent the boundary of a nodule, whereas long edges at large scales may correspond to large structures such as vessels and ribs in the chest radiographs. Therefore, by keeping only the edges in a certain length range and intensity range, we can extract the nodule boundaries effectively, as demonstrated later in this subsection.

The main processes involved in the multiscale edge detection are:
1. Smoothing an image by a convolution with a cubic spline function;
2. Computing the gradient vector at each point of the

smoothing filter. In the second process, the gradient of an image $I(x, y)$, $(x, y) \in R^2$, at scale $j$ was obtained by:

$$\begin{pmatrix} G_j^1 I(x, y) \\ G_j^2 I(x, y) \end{pmatrix} \equiv \begin{pmatrix} I * \psi_j^1(x, y) \\ I * \psi_j^2(x, y) \end{pmatrix} = \begin{pmatrix} I * (\partial\phi_j(x, y)/\partial x) \\ I * (\partial\phi_j(x, y)/\partial y) \end{pmatrix} = \vec{\nabla}(I * \phi_j)(x, y),$$

$$\psi_j^l(x, y) = \psi^l\big((x, y)/2^j\big) / \sqrt{2^j},$$

$$\phi_j(x, y) = \phi\big((x, y)/2^j\big) / \sqrt{2^j},$$

where $\phi_j(x, y)$ represents the spline function at scale $j$, and $\psi_j^l(x, y)$, $l = 1, 2$, represents the edge detectors at scale $j$ with orientation $x$ ($l = 1$) and $y$ ($l = 2$) defined by $\psi^1 \equiv \partial\theta/\partial x$ and $\psi^2 \equiv \partial\theta/\partial y$. The symbol $*$ indicates the convolution operation $f(x) * g(x) \equiv \int f(u)\, g(x - u)\, du$. Fig. 5(a) shows an example of an ROI, $I(x, y)$, of a chest radiograph containing a nodule at its center portion. Fig. 5(b)–(e) show the x-directional gradients $G_j^1 I(x, y)$ at scales $j = 1$ to 4. Similarly, Fig. 5(f)–(i) show the y-directional gradients $G_j^2 I(x, y)$ at scales $j = 1$ to 4.

Because the direction of the gradient vector of a point $(x, y)$ indicates the direction along which the image $I(x, y)$ has the steepest slope, a point $(x, y)$ can be considered an edge point at scale $j$ if the magnitude of the gradients,

$$M_j I(x, y) \equiv \sqrt{|G_j^1 I(x, y)|^2 + |G_j^2 I(x, y)|^2},$$

attains a local maximum along the gradient direction given by

$$A_j I(x, y) \equiv \arctan\big(G_j^2 I(x, y) / G_j^1 I(x, y)\big).$$

Therefore, in the third step, local maximum points were connected by an eight-connectivity test to form edge segments. We call the resulting sets of edge segments the edge representation at scale $j$ of the image $I(x, y)$. A set of the edge representations at multiple scales is called the multiscale edge representation. Figs. 6(a)–(d) show the magnitude of the gradients $M_j I(x, y)$ at scales 1 to 4 obtained from Figs. 5(b)–(i). Figs. 6(e)–(h) show the edge representations at scales 1 to 4 corresponding to Figs. 6(a)–(d). As shown in Fig. 6(g), the boundary of the nodule is well extracted at scale 3. Edge representations at scales 1 and 2 are noisy, and the boundary of the nodule is less clear than the boundary at scale 3, as demonstrated in Figs. 6(e) and (f). As shown in Fig. 6(h), the edge

smoothed image; representation at scale 4 provides a rough outline of the3. Detecting the sharp variation points where the mag- nodule.

nitude of the gradient vector is maximum along the In the fourth process, edges at individual scales weredirection of the gradient vector, and recording (a) the thresholded by a certain length and average intensity forposition of each of these local maxima, (b) the value of removal of short and/or low-intensity edges due to noisethe magnitude at the corresponding location, and (c) the in the original image. For example, Figs. 6(i)–(l) show theangle at the location; multiscale edge representation obtained by removal of

4. Eliminating the short and/or low-intensity edge seg- edges shorter than 10 pixels in length from Figs. 6(e)–(h).ments. As a result of this thresholding, most of the short edges are

In the first process, a cubic spline functionf was used as a eliminated, and relatively long and clear edges due to the


Fig. 5. Multiscale gradients of an ROI containing a nodule at the center. (a) An example of an ROI, $I(x, y)$, of a chest radiograph containing a nodule at the center portion. (b)–(e) The x-directional gradients $G_j^1 I(x, y)$ at scales $j = 1$ to 4. (f)–(i) The y-directional gradients $G_j^2 I(x, y)$ at scales $j = 1$ to 4.

Fig. 6. Magnitude of the multiscale gradients, the multiscale edges, and the reduced multiscale edges. (a)–(d) The magnitude of the gradients $M_j I$ at scales 1 to 4. (e)–(h) The edge representations at scales 1 to 4 corresponding to (a)–(d). (i)–(l) The multiscale edge representation obtained by removal of edges shorter than 10 pixels in length from (e)–(h).
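The gradient magnitude and angle of Section 2.4.1, together with the length and average-intensity thresholding of edge segments, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the representation of an edge segment as a list of (x, y, magnitude) tuples, and the inclusive comparisons are all hypothetical choices made for the example.

```python
import math


def gradient_magnitude_angle(g1, g2):
    """Magnitude and angle of a gradient with x-component g1 and y-component g2."""
    magnitude = math.hypot(g1, g2)  # sqrt(|g1|^2 + |g2|^2)
    # The text writes arctan(G2 / G1); atan2 is used here (an assumption)
    # so that g1 == 0 does not raise and the quadrant is preserved.
    angle = math.atan2(g2, g1)
    return magnitude, angle


def threshold_edges(segments, min_length, min_mean_intensity):
    """Keep edge segments that are long enough and strong enough on average.

    Each segment is a list of (x, y, magnitude) tuples (hypothetical layout).
    Whether the original comparisons are inclusive is not stated in the text;
    inclusive comparisons are assumed here.
    """
    kept = []
    for segment in segments:
        if not segment:
            continue
        mean_intensity = sum(m for _, _, m in segment) / len(segment)
        if len(segment) >= min_length and mean_intensity >= min_mean_intensity:
            kept.append(segment)
    return kept
```

With `min_length=10`, the second function mirrors the example in the text in which edges shorter than 10 pixels are removed.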


nodule boundary and rib edges remain. In the following, the thresholding operation is denoted by $T_{u,p}$, where $u$ and $p$ are the threshold values for the length and average intensity of an edge segment, respectively. By using this notation, we can express the thresholded edge representation at scale $j$ as $T_{u,p} M_j I(x, y)$.

2.4.2. Estimation of nodule boundaries by edge-guided wavelet snake model

Although the multiscale edges are effective in extracting portions of the boundaries of nodules, they do not necessarily provide a complete description of the boundary of a nodule. Therefore, we developed an edge-guided wavelet snake model for extraction of a boundary feature that is effective in reducing false positives.

A continuous snake on a plane, $\vec{s} \in R^2$, is defined by

$$\vec{s} = \begin{pmatrix} s^1(t, \vec{w}^1) \\ s^2(t, \vec{w}^2) \end{pmatrix},$$

where $t \in [0, 1]$ is a contour parameter of the snake, and $\vec{w}^l$ ($l = 1, 2$) are parameters that determine the shape of the snake. Later in this subsection, $\vec{w}^l$ are shown to be the wavelet coefficients of the snake contour. Let us assume $\psi(x)$ and $\phi(x)$ to be a one-dimensional orthogonal wavelet on $[0, 1)$ (Daubechies, 1992). The wavelet $\psi_{j,k}(x)$ at resolution level $j$ and translation $k$ is defined by dilation and translation of the mother wavelet $\psi(x)$ as follows:

$$\psi_{j,k}(x) = \psi(x/2^j - k)/2^j.$$

In a discrete formulation, if one divides the range of the contour parameter $t$ into $N$ intervals with equal width, such as $t_n = n/N$, $0 \le n \le N - 1$, the wavelet snake with length $N$ can be defined by

$$\vec{s} = \begin{pmatrix} s_n^1(\vec{w}^1) \\ s_n^2(\vec{w}^2) \end{pmatrix} = \begin{pmatrix} c^1 + \sum_{j=1}^{L} \sum_{k=0}^{2^{L-j}-1} w_{j,k}^1\, \psi_{j,k}(t_n) \\ c^2 + \sum_{j=1}^{L} \sum_{k=0}^{2^{L-j}-1} w_{j,k}^2\, \psi_{j,k}(t_n) \end{pmatrix}_{0 \le n \le N-1}, \qquad c^l \equiv \frac{1}{N} \sum_{n=0}^{N-1} s_n^l, \qquad (2)$$

where $c^l$ ($l = 1, 2$) is the mean value of $s_n^l$, and $\vec{w}^l = \{w_{j,k}^l\}$. In Eq. (2), the constant $L$ represents the maximum scale of the wavelet transform defined by $L = \log_2 N$. Here, the length of the wavelet snake $N$, which is the number of discrete points, is assumed to be a power of two in order to allow the fast wavelet transform to be used.

The deformation of the wavelet snake is performed in a model-based manner. In this study, we used the thresholded edge representation at a particular scale, $T_{u,p} M_j I(x, y)$, as an image $K$ on which the snake acts. We used the wavelet snake to fit the multiscale edges to estimate the boundary of a nodule in the image. Therefore, the model image was defined as an arbitrarily shaped black region, $M_{in}$, with a mean value $m_{in}$, surrounded by a white region, $M_{out}$, with a certain mean pixel value of $m_{out}$ ($m_{out} > m_{in}$).

In order to identify the boundary of these two regions, we used the following cost function:

$$H = H_{in} + H_{out},$$

$$H_{in} \equiv H_{in}(K, \vec{s}; m_{in}) = \int_{A_{in}} \big(K(\vec{x}) - m_{in}\big)^2\, d\vec{x}, \qquad (3)$$

$$H_{out} \equiv H_{out}(K, \vec{s}; m_{out}) = \int_{A_{out}} \big(K(\vec{x}) - m_{out}\big)^2\, d\vec{x},$$

where $K(\vec{x})$ is the pixel value of a given image $K$ at $\vec{x} = (x, y)$. The integration of the first term is performed over $A_{in}$, which is defined as an internal area circumscribed by the snake $\vec{s}$, and the integration of the second term is calculated over $A_{out}$, which is defined as an area outside the snake. Here, no specific sizes are assumed for $A_{in}$ and $A_{out}$. If $K$ represents the model image described above, and a wavelet snake is placed on this image, there are three possibilities for the value of $H$:
1. If the snake is completely confined inside $M_{in}$, the value of the cost function $H$ is greater than zero because $H_{in} = 0$ and $H_{out} > 0$;
2. If the snake is completely outside $M_{in}$, the value of the cost function $H$ is also greater than zero because $H_{in} > 0$ and $H_{out} = 0$;
3. If the snake fits the boundary of the two regions $M_{in}$ and $M_{out}$, the cost function $H$ becomes zero, because $H_{in}$ and $H_{out}$ become zero.
Therefore, the wavelet snake can fit the boundary of the two regions $M_{in}$ and $M_{out}$ by minimization of $H$. As noted previously, we used the edge representation at a particular scale as the image $K$. Therefore, we consider the region inside the nodule as zero for the edge representation, and thus $m_{in}$ and $m_{out}$ are set to zero and to a certain positive value for the cost function, respectively.

We used a gradient descent algorithm to perform the minimization of the cost function $H$ by changing the wavelet coefficients, $w_{j,k}^i$ ($i = 1, 2$), as follows:

$$w_{j,k}^i \leftarrow w_{j,k}^i - h\, \frac{\partial H}{\partial w_{j,k}^i}, \qquad J \le j \le L, \quad 0 \le k \le 2^{L-j} - 1, \qquad (4)$$

where $h$ is a small constant specifying the step size in the gradient descent algorithm and $J$ represents the minimum scale of the wavelet coefficient being updated. As shown in Appendix B, the partial derivative $\partial H/\partial w_{j,k}^i$ is expressed in a closed form by use of the fast wavelet transform. Moreover, the sparse representation provided by the wavelet transform allows only a small number of wavelet coefficients, controlled by the minimum scale $J$ in Eq. (4), to be used for the deformation of the snake, thus reducing the complexity of the snake model.

The multiscale edges described in Section 2.4.1 were


used to 'guide' the deformation of the wavelet snake for estimation of the boundary of a nodule. Before application of the wavelet snake, only those edges surrounding the center were thickened outward along the radial direction from the center of the image. We performed this by first drawing several radial lines leading out from the center, identifying the edge point on each radial line that is closest to the center, and adding two pixels with the same intensity as that of the edge point to both sides of the point along the radial line. This edge-thickening process has the effect of stabilizing the deformation process of the wavelet snake when it fits the edges. Otherwise, the wavelet snake may 'stride over' the edges, because the overlap between the one-pixel-wide edges and the snake may be small even though the snake is close to the edges. Then the wavelet snake, which was initially placed at the center of the image, was grown outward by iterative application of the update rule in Eq. (4), until the snake fitted the thickened edges.

Figs. 7(a) and (c) show examples of ROIs containing a true nodule and a false positive, respectively. Figs. 7(b) and (d) show the shapes of the wavelet snake at several iterations when the edge representations at scale $j = 3$ of Figs. 7(a) and (c) are used. When the snake is applied to a nodule, the fitted snake is compact and identifies the boundary of the nodule well, as demonstrated in Fig. 7(b). On the other hand, when the snake is applied to a false positive, the fitted snake is rather divergent and only partly fits the edges due to ribs, as shown in Fig. 7(d).

2.4.3. Snake feature: weighted overlap between snakes and multiscale edges

At each iteration of the snake evolution, the weighted overlap that measures the agreement between the fitted snake and the edge representation at scale $j$ was calculated. Here, the weighted overlap is defined as the sum of pixel values on the discrete points on the snake overlapped with edges as follows:

$$D_j = \sum_{l=1}^{N} T_{u,p} M_j I(\vec{s}_l).$$

Then the first local maximum is identified as the representative weighted overlap for that ROI. Fig. 8 is an example of the plot of the weighted overlap at scale 3 as a function of the number of iterations obtained from the ROIs shown in Fig. 7, in which the representative weighted overlap is shown by black circles. In the rest of the paper, the representative weighted overlap is simply referred to as the weighted overlap.

In general, the weighted overlap for nodules appears to be higher than those of false positives because the wavelet snake is designed to capture the boundary of a nodule-like object. On the other hand, if the edges consist of a large number of irregular curves due to normal structures, they may not be fitted well by the wavelet snake, and yield a divergent snake with a low degree of weighted overlap. Therefore, the weighted overlap was used as a measure for distinction between true nodules and false positives.

It should be noted that, if we use the edge representation at only a single scale, the overall performance of our method is limited because of the fuzziness of the nodule boundaries, or the interference by noise and normal structures. Therefore, a combination of edge representations at two scales was employed. For this purpose, the weighted overlap at scale 3 ($D_3$) was calculated for a given ROI. If the weighted overlap was smaller than a certain threshold level, the weighted overlap at scale 4 ($D_4$) was

Fig. 7. (a) and (c) Examples of ROIs containing a true nodule and a false positive, respectively. (b) and (d) The shapes of the wavelet snake at several iterations when the edge representations at scale $j = 3$ of (a) and (c) are used.

Fig. 8. An example of the change of the weighted overlap between the wavelet snake and the edges at scale 3 as a function of the number of iterations for the snake evolution.
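The weighted overlap of Section 2.4.3, the choice of its first local maximum over the snake iterations, and the two-scale rule for $D_3$ and $D_4$ can be sketched as follows. This is an illustrative sketch only: the function names, the nearest-pixel lookup, the image layout `edge_image[row][column]`, and the fallback used when no interior local maximum exists are assumptions that the text does not specify.

```python
def weighted_overlap(edge_image, snake_points):
    """Sum the thresholded edge-magnitude values under the snake's discrete points.

    edge_image is a list of rows (edge_image[row][column]); snake_points is a
    list of (x, y) positions. Points are rounded to the nearest pixel and
    points falling outside the image contribute nothing (both are assumptions).
    """
    n_rows, n_cols = len(edge_image), len(edge_image[0])
    total = 0.0
    for x, y in snake_points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < n_rows and 0 <= col < n_cols:
            total += edge_image[row][col]
    return total


def first_local_maximum(overlaps):
    """Return the first interior local maximum of a non-empty overlap history.

    If the sequence has no interior local maximum, the global maximum is
    returned instead (a fallback assumed for this sketch).
    """
    for i in range(1, len(overlaps) - 1):
        if overlaps[i - 1] < overlaps[i] >= overlaps[i + 1]:
            return overlaps[i]
    return max(overlaps)


def combined_overlap(d3, compute_d4, threshold):
    """Use D3 alone when it reaches the threshold; otherwise take max(D3, D4).

    compute_d4 is a zero-argument callable so that the scale-4 snake is only
    evaluated when it is needed.
    """
    if d3 >= threshold:
        return d3
    return max(d3, compute_d4())
```

For an overlap history such as `[10, 30, 50, 40, 60]`, `first_local_maximum` returns 50, the first peak, even though a larger value occurs later.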


also calculated. Then the maximum value of $D_3$ and $D_4$ was used as the weighted overlap for that ROI, which is called the weighted overlap for the ROI in the rest of this paper.

2.5. Combination of features by artificial neural network

To achieve a high performance in the reduction of false positives, we combined the weighted overlap obtained from the wavelet snake and a subset of morphological features by ANN (Haykin, 1999). The ANN used in this study was a conventional multilayer feed-forward perceptron trained by back-propagation. To achieve a high generality, we used a two-layer ANN, in which the number of input neurons was equal to the number of input features. A single output neuron was used in this ANN. The number of neurons in the hidden layer was set to (1 + number of input neurons)/2.

3. Experimental results

3.1. Method for evaluation of performance in reduction of false positives

As described in Section 1, the most important indication of the performance of a false-positive reduction method is its ability to distinguish between nodules and false positives. In this study, therefore, ROC analysis (Metz, 2000) was employed for evaluation of the performance of the computerized scheme in the distinction between ROIs as either containing nodules or belonging to the normal background structures. ROC analysis is a statistical analysis technique that estimates the true-positive fraction obtained by a method as a function of the false-positive fraction. In the following, we describe the process of ROC analysis for the weighted overlap; however, the same procedure applies to the morphological features as well as to the output of the ANN.

In a given set of ROIs, those ROIs containing actual nodules are called abnormal ROIs, and the remaining ROIs containing only normal structures are called normal ROIs. Given a threshold value, we call an ROI positive if the weighted overlap is greater than the threshold value; otherwise, the ROI is called negative. The true-positive fraction is then defined as the fraction of abnormal ROIs correctly identified as positive, and the false-positive fraction is defined as the fraction of normal ROIs incorrectly identified as positive. The ROC analysis was performed by subjecting the weighted overlap from individual ROIs to the LABROC4 program developed by Metz (1986), which generates an ROC curve. The most commonly employed univariate summary of an ROC curve is the area under curve, defined as the area under the entire ROC curve when it is plotted in a unit square. This index, denoted by $A_z$, indicates an unbiased estimation of the performance of the discrimination method being tested (Metz, 1989). Generally, a larger $A_z$ indicates a better overall performance in distinguishing between true and false positives. If the wavelet snake effectively fits the boundaries of the nodules, and does not fit edges in the normal structure, then the weighted overlap of an abnormal ROI becomes high, whereas that of a normal ROI is low; this yields a large $A_z$ value (maximum: 1.0). On the other hand, if the method does not differentiate between the structures in the ROIs, then the weighted overlap takes similar values for both normal and abnormal ROIs, and this yields a low $A_z$ value (minimum: 0.5).

3.2. Performance obtained by wavelet snake and morphological features

A total of 50 chest images in our database were used for evaluation of the performance of our CAD scheme. The parameter setting in the initial candidate selection step described in Section 2.2 yielded 45 actual nodules and 748 false positives as initial nodule candidates, which provided an initial sensitivity of 90% with a false-positive rate of 15 per image. Then an ROI with a matrix size of 45 × 45 mm² that had a center at the location of the center of the candidate was extracted from the original chest images. Consequently, 45 abnormal ROIs and 748 normal ROIs were extracted at this stage.

To evaluate the performance of the wavelet snake, we subjected these 793 ROIs to the multiscale edge detection process described in Section 2.4.1. In this process, the threshold for the length of the edges $u$ was set to 15, and that for the average intensity $p$ was set to $\mu - 2\sigma$, where $\mu$ and $\sigma$ are the mean and the standard deviation of the intensity of the edges in the ROI, respectively. These threshold values were set so that most of the edges due to nodules are kept in the abnormal ROIs. Then the wavelet snake in Section 2.4.2 was applied to these multiscale-edge-detected ROIs, and the weighted overlap was measured for each ROI. We experimented with several different types of wavelets, including Daubechies, Symmlet, Meyer, Battle–Lemarie, and Coiflet wavelets with various vanishing moments (Daubechies, 1992). Little difference in the results from the different types of wavelets was observed. However, Coiflet with support 18 was found to yield a slightly higher performance than that of the others, probably because it is nearly symmetric and smooth enough. Therefore, we used Coiflet 18 as the mother wavelet that defines the wavelet snake in Eq. (2). For the parameters in Eq. (3), $m_{in}$ was set to 0, and $m_{out}$ was adaptively set to the mean value of the gray levels of the edges in the ROIs. The length of the snake $N$ was set to 256, although lengths of 128 or 512 produced the same results. The minimum scale to be updated, $J$, was set to 6, so that only 8 wavelet coefficients were used for changing the shape of the snake. Finally, the weighted overlap described in Section 2.4.3 was measured for each ROI.


Fig. 9. Histogram of the weighted overlap for the 793 ROIs, which consisted of 45 nodules and 748 false positives.
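The true-positive and false-positive fractions defined in Section 3.1 can be illustrated with a short sketch that thresholds a feature such as the weighted overlap. Note that the area computed here is the empirical (rank-based, Mann-Whitney) area under the ROC curve, which is swapped in for illustration; it is not the binormal fit produced by the LABROC4 program used in the paper, so its values need not match the reported $A_z$. The function names and the toy data in the usage note are hypothetical.

```python
def roc_point(abnormal_values, normal_values, threshold):
    """True- and false-positive fractions when 'positive' means value > threshold."""
    tpf = sum(v > threshold for v in abnormal_values) / len(abnormal_values)
    fpf = sum(v > threshold for v in normal_values) / len(normal_values)
    return tpf, fpf


def empirical_auc(abnormal_values, normal_values):
    """Empirical area under the ROC curve (not the binormal A_z estimate).

    Counts the fraction of (abnormal, normal) pairs in which the abnormal ROI
    has the larger feature value; ties count as one half.
    """
    wins = 0.0
    for a in abnormal_values:
        for n in normal_values:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_values) * len(normal_values))
```

For example, with toy values `abnormal = [3, 4]` and `normal = [1, 2, 3]`, a threshold of 2 keeps both abnormal ROIs and one of the three normal ROIs as positive.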

Fig. 9 is a histogram of the weighted overlap for the 793 ROIs. The weighted overlaps for the abnormal ROIs tend to be higher than those for the normal ROIs. The mean value of the weighted overlap for the abnormal ROIs was 165.5 with a standard deviation of 36.6, whereas the values for the normal ROIs were 126.1 and 32.5, respectively. A two-tailed t-test showed that the difference between the two means was statistically significant (P < 0.001). The histogram in Fig. 9 indicates that, despite a large overlap between the abnormal and normal ROIs, the wavelet snake could have a high performance in eliminating false positives; if the ROIs with a weighted overlap less than 90, indicated by a dotted line, were removed as false positives, this method was able to eliminate approximately 20% of the false positives with loss of only one (2%) nodule. The lowest curve in Fig. 10 is an ROC curve generated from the histogram in Fig. 9 by use of the weighted overlap as a decision variable. This ROC curve yielded an $A_z$ value of 0.78.

Fig. 10. ROC curves and their $A_z$ values, showing the performance of three false-positive reduction methods in distinguishing between nodules and false positives: weighted overlap by wavelet snake, combination of ten morphological features alone, and combination of the weighted overlap and ten morphological features by use of an artificial neural network.

The performance of the 10 morphological features described in Section 2.3.2 and Appendix A was evaluated by ROC analysis in the same manner as described above. Here, the parameters for the iris filter in Eq. (1) were set to be $R_{min} = 7$ mm, $R_{max} = 40$ mm, and $N = 32$, based on the minimum and maximum sizes of the nodules intended to be detected in our database. It should be noted, however, that the effective support of the iris filter is adaptive due to the maximum operation in Eq. (1), although the upper and lower limits of the support are bounded by the parameters $R_{min}$ and $R_{max}$. Table 1 lists the $A_z$ values for the performance of the 10 morphological features in discriminating between true and false positives. This table indicates that the weighted overlap by wavelet snake yields a much higher performance in distinguishing true nodules from false detections than do any of these conventional morphological features.

Table 1
$A_z$ values obtained from individual morphological features for discrimination between true and false positives

Parameter            A_z value
Area                 0.65
Perimeter            0.61
Width                0.67
Height               0.56
Width/height         0.53
Rectangularity       0.53
Circularity          0.57
Curvature            0.53
Moment (m_02)        0.64
Moment (m_20)        0.55

3.3. Performance of features combined by artificial neural network

As described in Section 2.5, features obtained with the wavelet snake and a subset of the 10 morphological features were combined by ANN to achieve a high performance in the reduction of false positives. To estimate an average performance of the combined features, we used a jackknife test (Efron, 1982) for the training and testing of the neural network. In this method, one half of the ROIs were selected randomly from the entire set of ROIs as a training set for the ANN, and the other half were used as a testing set for evaluation of the performance of the trained


ANN. The performance was measured by the $A_z$ values obtained by ROC analysis of the output of the ANN. The jackknife test was repeated 40 times by random partitioning of the set of ROIs for training and testing.

Fig. 11 shows the curves representing the change in the $A_z$ values for the testing set along with the number of iterations for the training of the ANN. These curves were obtained by averaging, at each iteration, the $A_z$ values from all the jackknife sessions. The bottom curve was obtained by combining a subset of the 10 morphological features by ANN. The maximum $A_z$ value, which is regarded as the representative performance (Smith, 1993), was 0.83, as indicated by a black circle. This subset consisted of 7 features: area, perimeter, the ratio of width to height, circularity, mean curvature of the boundary, and two moments ($m_{02}$, $m_{20}$). We selected these features, among all possible 1014 combinations of the 10 morphological features, as a set of features that yield the highest $A_z$ value when combined by ANN and evaluated by the jackknife test. On the other hand, the top curve was obtained by subjecting both the weighted overlap and the selected 7 morphological features to the ANN, which yielded a maximum $A_z$ value of 0.87. This performance is substantially higher than that obtained by combination of the 7 morphological features alone by ANN. Because these $A_z$ values were averages of 40 jackknife sessions, a two-tailed t-test was carried out, which showed that the difference between the two $A_z$ values was statistically significant (P < 0.001).

Fig. 10 shows the ROC curves and their $A_z$ values indicating the performance of the three false-positive reduction methods, i.e. the weighted overlap obtained by wavelet snake, the 7 morphological features combined with an ANN, and a combination of all of these features with an ANN. The maximum performance was obtained by the ANN that combines the weighted overlap with seven morphological features, which yielded an $A_z$ value of 0.87. An operating point on the curve shows a sensitivity of 93% with a specificity of 50%. This result indicates that the combination of the wavelet snake and the morphological features can eliminate 50% of false positives with loss of 7% of nodules when it is applied to the initial nodule candidates.

3.4. Overall performance of the computerized detection scheme

Free receiver operating characteristic (FROC) analysis was employed for evaluation of the overall performance of our CAD scheme in the detection of pulmonary nodules in digital chest radiographs. In the FROC analysis, generally, a parameter in a CAD scheme is varied to show the change in the sensitivity and the average number of false positives per image yielded by the scheme. As a result, a FROC curve is obtained in which the sensitivity is plotted as a function of the average number of false positives per image.

We used a threshold value for the output of the ANN, which combines the weighted overlap and the morphological features, as the parameter for generating an FROC curve. The output value of the ANN for a given ROI ranges from 0 to 1, where 0 indicates that the ROI is negative, and 1 indicates that the ROI is positive. For a given set of ROIs, we can calculate the true- and false-positive rate by setting a threshold value between 0 and 1, and regarding the ROIs with ANN output larger than the threshold value as 'positive' and those below the threshold value as 'negative'. Changing the threshold value from 0 to 1 changes the true- and false-positive rate of the CAD scheme, and thus generates an FROC curve. Because the top ROC curve in Fig. 10 was generated by a jackknife resampling method, this ROC curve provides the estimated change of the true- and false-positive rates when the threshold value for the output of the ANN is changed over the population of initial candidates. Therefore, we applied the pairs of true- and false-positive fractions in this ROC curve to the initial nodule candidates to generate the FROC curve shown in Fig. 12.
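One plausible reading of how a pair of ROC fractions is applied to the initial candidates is a simple rescaling, sketched below. This is an interpretation offered for illustration and not a confirmed description of the authors' procedure: the function name is hypothetical, and the sketch assumes that the overall sensitivity is the initial sensitivity multiplied by the true-positive fraction, and that the false positives per image are the initial false positives per image multiplied by the false-positive fraction.

```python
def roc_to_froc(tpf, fpf, initial_sensitivity, initial_fp_per_image):
    """Map one ROC operating point onto FROC coordinates.

    Assumes (for this sketch) that the false-positive reduction stage acts
    independently on the candidates kept by the initial detection step.
    Returns (overall sensitivity, false positives per image).
    """
    sensitivity = initial_sensitivity * tpf
    fp_per_image = initial_fp_per_image * fpf
    return sensitivity, fp_per_image


# Figures quoted in Section 3.2: 45 nodules detected at 90% initial
# sensitivity, and 748 false positives over 50 images.
INITIAL_SENSITIVITY = 0.90
INITIAL_FP_PER_IMAGE = 748 / 50

# ROC operating point quoted in Section 3.3: sensitivity 93%, specificity 50%,
# i.e. a false-positive fraction of 0.50.
sens, fps = roc_to_froc(0.93, 0.50, INITIAL_SENSITIVITY, INITIAL_FP_PER_IMAGE)
```

Under these assumptions the quoted operating point corresponds to an overall sensitivity of about 84% at roughly 7.5 false positives per image; other points on the ROC curve would be mapped in the same way.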

Fig. 11. Curves representing the change in the $A_z$ values obtained from the output of an ANN for the testing set as a function of the number of iterations used for training. The top curve shows the $A_z$ values yielded by an ANN that combines the weighted overlap obtained by the wavelet snake and the 10 morphological features. The bottom curve shows the $A_z$ values yielded by combination of 10 morphological features alone.

As shown in this FROC curve, the CAD scheme yielded an estimated sensitivity of 70% at an average number of false positives of three per image, which represents the performance obtained by the entire CAD scheme based on the combination of the wavelet snake and the 7 morphological features. For comparison, two additional FROC curves were generated. The middle FROC curve was


obtained with the 7 morphological features alone, which shows average false positives of 4.5 per image at an estimated sensitivity of 70%. The bottom curve is a FROC curve generated for the wavelet snake alone, which yielded an average false-positive rate of 5.5 per image at the same sensitivity. As shown in Fig. 12, the CAD scheme based on the combination of the wavelet snake and seven morphological features outperformed those of the other two FROC curves.

Fig. 12. FROC curves showing the performance of different CAD schemes in the detection of lung nodules in chest radiographs. The top FROC curve shows the performance of a CAD scheme in which an ANN, which combines the degree of overlap obtained by the wavelet snake and the ten morphological features, was used as a method for reduction of false positives. The middle and bottom FROC curves show the performance of CAD schemes that use the combination of ten morphological features alone and the weighted overlap by wavelet snake, respectively, as methods for reduction of false positives.

Once the training of the ANN was completed, the total computation time for processing one case was, on average, 1.3 min on a personal computer (Dell Dimension XPS R450, Dell Computer Corporation, Round Rock, TX) with a single processor (Pentium III, 450 MHz, Intel Corporation, Santa Clara, CA).

4. Discussion

As shown in Fig. 10, the weighted overlap by wavelet snake yielded an $A_z$ value of 0.78 in the discrimination between true and false positives. Although this $A_z$ value is lower than that obtainable from the combination of 7 morphological features, the weighted overlap appears to be an effective feature for reduction of false positives because its $A_z$ value is higher than that for any single morphological feature shown in Table 1. Moreover, this seemingly low $A_z$ value for the wavelet snake was due to a nodule case with a very fuzzy boundary, which had a very low degree of weighted overlap of 45 (see Fig. 9). If this outlier is excluded, the $A_z$ value will be increased to 0.80, which is comparable to the performance of the combination of morphological features.

We reduced the dimensionality of the original set of morphological features from 10 to 7 by selecting the features that yielded the highest performance in the discrimination task. Nevertheless, the wavelet snake was shown to increase the discrimination performance substantially when the weighted overlap was combined with the seven morphological features. The implication of this is that the wavelet snake provides a feature that does not correlate with, but rather complements, these morphological features. This is a substantial advantage in increasing the overall performance because, if one adds an arbitrary feature that might be correlated with morphological features, the added feature might be useless or might even degrade the performance of the discrimination because of the curse of dimensionality.

When the combined method was used in the computerized detection scheme, it yielded a moderately high sensitivity of 70% at a low false-positive rate of 3 per image, which is comparable to that of the existing method for the detection of pulmonary nodules (Xu et al., 1997; Carreira et al., 1998). Currently, comparisons of the various methods are not possible because of the different databases of images and the different means used for evaluation. Considering the simplicity of the detection methods used in our CAD scheme, the fact that some of the true nodules were already eliminated at the initial candidate selection step, and the large reduction of the false-positive rate that was achieved (from 15 to 3 per image), this overall performance indicates the efficacy of the false-positive reduction methods used in our scheme. We expect that our scheme will perform better than their schemes if the additional processes used in their schemes, such as a background trend correction, feature analysis based on a difference-image technique, and nodule segmentation based on transition-point analysis, are added.

5. Conclusion

We have developed a novel scheme for automated detection of pulmonary nodules in chest radiographs. After selection of initial nodule candidates, two processes were applied in parallel for reducing the false-positive detections. One process consisted of iris filtering for enhancement of nodules and suppression of normal lung structures, followed by extraction of conventional morphological features. The other process consisted of a new approach for elimination of false positives called the wavelet snake. In this process, multiscale edges of the candidate nodules were extracted to yield parts of the nodule boundaries. The wavelet snake was then used for fitting of these multiscale edges for approximation of the true boundaries of nodules. A feature called the weighted


overlap between the snake and the multiscale edges was calculated and used for elimination of false positives. Finally, the weighted overlap obtained from the wavelet snake and a set of 10 morphological features were combined by use of a two-layer ANN trained by back-propagation.

ROC analysis was employed for evaluation of the performance of the combined features, which yielded a high $A_z$ value of 0.87 in discriminating between true and false positives. When used in the entire scheme, the combined features achieved a large reduction of the false-positive rate: from 15 false positives per image to 3 per image. We therefore believe that our new method, in particular the false-positive reduction method based on the wavelet snake, is effective in improving the performance of our computerized scheme for detection of pulmonary nodules in chest radiographs.

Acknowledgements

This work is supported by USPHS Grants CA85668 and ACS Research Project Grant RPG-0006301CCE. The authors are grateful to John H. LeVan, PhD, Allen H. Hrejsa, PhD, Kyung S. Han, PhD, and Ernest Sukowski, PhD, for their useful suggestions and discussions. The authors also thank the members of the Kurt Rossmann laboratories in the Department of Radiology at The University of Chicago for useful discussions.

Appendix A. Morphological features

The definitions of the 10 morphological features introduced in Section 2.3.2 are described in this appendix. In the following, ‘nodule candidate’ indicates an island created after binarization and eight-connectivity region growing of an image.

• Area: The area of a nodule candidate is the number of pixels within the contour of the island that represents the candidate.
• Perimeter: This is the number of pixels along the margin of the nodule candidate, taking into account the outside edges of the pixels.
• Width and height: The width of a nodule candidate is the width of the minimum enclosing rectangle (MER) around the candidate nodule. The height of a nodule candidate is the height of the MER.
• Rectangularity: Rectangularity is the ratio of the candidate nodule’s area to the area of the nodule’s minimum enclosing rectangle and is given by

$$R = \frac{A}{\text{width} \times \text{height}},$$

where $A$ is the area of the candidate nodule.
• Circularity: Circularity is defined by

$$C = \frac{\text{area}}{\text{perimeter}} \times 4\pi.$$

Circularity takes on a maximum value of $4\pi$ for a circular shape and yields lower values for more complex shapes.
• Curvature: Let us suppose that the distance around the boundary of a nodule candidate is measured from a starting point with a variable $k$. At any point, the boundary has an instantaneous radius of curvature $r(k)$, which is the radius of the circle tangent to the boundary at that point. Then the curvature at point $k$ is given by

$$K(k) = \frac{1}{r(k)}.$$

• Moment: The moment of a nodule candidate $R$ is defined by

$$m_{ij} = \int_R (x - \bar{x})^i (y - \bar{y})^j \, dx \, dy,$$

where $(\bar{x}, \bar{y})$ is the center of mass of the nodule candidate, and $i$ and $j$ take on all nonnegative integer values. In this study, we used the following moments:

$$m_{02} = \int_R (x - \bar{x})^0 (y - \bar{y})^2 \, dx \, dy,$$

$$m_{20} = \int_R (x - \bar{x})^2 (y - \bar{y})^0 \, dx \, dy.$$

Appendix B. Derivation of the partial derivative of the cost function

The partial derivative of the cost function $\partial H / \partial w^i_{j,k}$ can be obtained by use of the orthogonal wavelet transform. If $K_{in} \equiv (K - m_{in})^2$ and $K_{out} \equiv (K - m_{out})^2$, we can rewrite the cost function $H$ defined in Eq. (3) as follows:

$$
\begin{aligned}
H &= \int_{A_{in}} K_{in} \, d\vec{x} + \int_{A_{out}} K_{out} \, d\vec{x} \\
  &= \int_{A_{in}} (K_{in} - K_{out}) \, d\vec{x} + \left[ \int_{A_{in}} K_{out} \, d\vec{x} + \int_{A_{out}} K_{out} \, d\vec{x} \right] \\
  &= \int_{A_{in}} h \, d\vec{x} + \int_{A} K_{out} \, d\vec{x}, \\
h &\equiv K_{in} - K_{out} = (K - m_{in})^2 - (K - m_{out})^2,
\end{aligned}
\tag{B.1}
$$

where $A_{in}$ is the area circumscribed by the snake, $A_{out}$ is the area outside of $A_{in}$, and $A \equiv A_{in} + A_{out}$ is the entire area of the given image $K$. Because the second term in the first equation in Eq. (B.1) is a constant, we denote that term by


$$C \equiv \int_{A} K_{out} \, d\vec{x}.$$

Let us define:

$$\tilde{H} \equiv H - C = \int_{A_{in}} h \, d\vec{x}.$$

Then the partial derivative of $H$ with respect to a wavelet coefficient $w^i_{j,k}$ is equal to that of $\tilde{H}$,

$$\frac{\partial H}{\partial w^i_{j,k}} = \frac{\partial \tilde{H}}{\partial w^i_{j,k}} = \int_0^1 h \, |J| \, dt, \tag{B.2}$$

where $|J|$ is a Jacobian for the change of parameters, which is defined by

$$
|J| =
\begin{vmatrix}
\dfrac{\partial s^1}{\partial t} & \dfrac{\partial s^2}{\partial t} \\
\dfrac{\partial s^1}{\partial w^i_{j,k}} & \dfrac{\partial s^2}{\partial w^i_{j,k}}
\end{vmatrix}
= \frac{\partial s^1}{\partial t} \frac{\partial s^2}{\partial w^i_{j,k}} - \frac{\partial s^2}{\partial t} \frac{\partial s^1}{\partial w^i_{j,k}}. \tag{B.3}
$$

We assume the mean value $c^l$ ($l = 1, 2$) in Eq. (2) to be zero without loss of generality. If we also assume that the wavelets $\psi_{j,k}$ in Eq. (2) are orthogonal, we can obtain the following formula:

$$\frac{\partial s^l}{\partial w^i_{j,k}} = \frac{\partial}{\partial w^i_{j,k}} \left\{ \sum_{p,q} w^l_{p,q} \, \psi_{p,q} \right\} = \delta_{il} \, \psi_{j,k}. \tag{B.4}$$

In this equation, $\delta_{il}$ is the Kronecker delta. Substituting Eqs. (B.3) and (B.4) into Eq. (B.2), we obtain

$$
\begin{aligned}
\frac{\partial H}{\partial w^i_{j,k}}
&= \int_0^1 \left\{ h \, \psi_{j,k} \, \delta_{i2} \frac{\partial s^1}{\partial t} - h \, \psi_{j,k} \, \delta_{i1} \frac{\partial s^2}{\partial t} \right\} dt \\
&= W\!\left[ h \frac{\partial s^1}{\partial t} \delta_{i2} \right]_{j,k} - W\!\left[ h \frac{\partial s^2}{\partial t} \delta_{i1} \right]_{j,k},
\end{aligned}
\tag{B.5}
$$

where $W[f]_{j,k}$ indicates the wavelet coefficient obtained by the wavelet transform $W$ defined by

$$W[f]_{j,k} = \int_0^1 f(t) \, \psi_{j,k}(t) \, dt.$$

We assume that the wavelets $\psi_{j,k}$ and the snake $s$ are periodic on the interval [0,1].

Eq. (B.5) provides the formula for the derivative of the cost function with respect to a wavelet coefficient. The derivative $\partial s^l / \partial t$ should be replaced by the difference between the adjacent points in the discrete formulation. The computation of the derivative in Eq. (B.5) can be performed quickly by means of the fast wavelet transform.
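As a rough numerical illustration of Eq. (B.5), the sketch below treats the special case $h \equiv 1$, in which $\tilde{H}$ reduces to the area enclosed by the snake. Several things here are assumptions and not taken from the paper: a discrete orthonormal Haar basis stands in for the wavelets, the snake is an 8-point polygon traversed clockwise with the y axis pointing up (the orientation under which the signs of Eq. (B.5) come out as written), $\partial s^l/\partial t$ is replaced by a periodic central difference, and all function names are hypothetical.

```python
import math

def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = avg + dif
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward: rebuild the signal from its coefficients."""
    signal = list(coeffs)
    n = 1
    while n < len(signal):
        merged = []
        for a, d in zip(signal[:n], signal[n:2 * n]):
            merged += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        signal[:2 * n] = merged
        n *= 2
    return signal

def enclosed_area(xs, ys):
    """Shoelace area, positive for a clockwise polygon (y axis pointing up)."""
    n = len(xs)
    return 0.5 * sum(xs[(i + 1) % n] * ys[i] - xs[i] * ys[(i + 1) % n]
                     for i in range(n))

def central_difference(values):
    """Periodic central difference, the discrete stand-in for d/dt."""
    n = len(values)
    return [0.5 * (values[(i + 1) % n] - values[i - 1]) for i in range(n)]

def area_gradient(w1, w2):
    """Eq. (B.5) with h = 1: gradient of the area w.r.t. the coefficients."""
    s1, s2 = haar_inverse(w1), haar_inverse(w2)
    grad_w1 = [-c for c in haar_forward(central_difference(s2))]  # i = 1 term
    grad_w2 = haar_forward(central_difference(s1))                # i = 2 term
    return grad_w1, grad_w2

def numeric_gradient(w1, w2, component, k, eps=1e-4):
    """Central finite difference of the area w.r.t. one coefficient."""
    def area_with(shift):
        a, b = list(w1), list(w2)
        (a if component == 1 else b)[k] += shift
        return enclosed_area(haar_inverse(a), haar_inverse(b))
    return (area_with(eps) - area_with(-eps)) / (2 * eps)

# Clockwise polygon sampled on an ellipse with semi-axes 3 and 2.
angles = [-2 * math.pi * n / 8 for n in range(8)]
xs = [3 * math.cos(a) for a in angles]
ys = [2 * math.sin(a) for a in angles]
w1, w2 = haar_forward(xs), haar_forward(ys)
grad_w1, grad_w2 = area_gradient(w1, w2)
```

Because the polygon area is bilinear in the coordinates, the finite-difference gradient should agree with the wavelet-domain expression up to rounding; with a non-constant $h$ the same structure would apply with $h$ multiplied into the differences before the transform.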

Page 17: Computerized detection of pulmonary nodules in chest radiographs based on morphological features and wavelet snake model

B. Keserci, H. Yoshida / Medical Image Analysis 6 (2002) 431–447 447

H.L., Van Metter, R.L. (Eds.), Handbook of Medical Imaging. SPIE Vittitoe, N.F., Baker, J.A., Floyd, C.E., 1997. Fractal texture analysis inPress, Bellingham, WA, pp. 751–770. computer-aided diagnosis of solitary pulmonary nodules. Acad.

Muhm, J., Miller, R., Fontana, R., 1983. Lung cancer detected during a Radiol. 4, 96–101.screening program using four-month chest radiographs. Radiology Xu, X.W., Doi, K., 1995. Image feature analysis for computer-aided148, 609–615. diagnosis: accurate determination of ribcage boundary in chest radio-

Naidich, D., Zerhouni, E., Siegelman, S., 1984. Computed Tomography graphs. Med. Phys. 22, 617–626.of the Thorax. Raven Press, New York, pp. 171–172. Xu, X.W., Doi, K., Kobayashi, T., MacMahon, H., Giger, M.L., 1997.

Penedo, M.G., Carreira, M.J., Mosquera, A., Cabello, D., 1998. Com- Development of an improved CAD scheme for automated detection ofputer-aided diagnosis: a neural-network-based approach to lung nodule lung nodules in digital chest images. Med. Phys. 24, 1395–1403.detection. IEEE Trans. Med. Imag. 17, 872–880. Yoshida, H., Xu, X.W., Kobayashi, T., Giger, M.L., Doi, K., 1995.

Revesz, G., Kundel, H.L., Graber, M., 1974. The influence of structured Computer-aided diagnosis scheme for detecting pulmonary nodulesnoise on the detection of radiographic abnormalities. Invest. Radiol. 9, using wavelet transform. Proc. SPIE 2434, 621–626.479–486. Yoshida, H., Katsuragawa, S., Amit, Y., Doi, K., 1997. Wavelet snake for

Smith, M., 1993. Neural Networks for Statistical Modeling.Von Nostrand classification of nodules and false positives in digital chest radio-Reinhold, New York. graphs. In: Proceedings of the IEEE Engineering in Medicine and

Toriwaki, J., Suenaga, Y., Negoro, T., Fukumura, T., 1973. Pattern Biology Society, pp. 509–512.recognition of chest X-ray images. Comput. Grap. Image Process. 2,252–271.
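Several of the morphological features defined in Appendix A can be sketched for a single binary island as follows. This is only a minimal illustration under stated assumptions: the mask is assumed to already contain exactly one island after binarization and region growing, the perimeter is interpreted as the count of pixel edges that face the background (one possible reading of "taking into account the outside edges of the pixels"), the moment integrals are replaced by sums over pixels, and circularity and curvature are left out. The function name `morphological_features` is hypothetical.

```python
def morphological_features(mask):
    """Compute a few Appendix A features for one island in a binary mask.

    mask is a list of rows holding 0 (background) or 1 (nodule candidate).
    """
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    filled = set(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]

    # Area: number of pixels in the island.
    area = len(pixels)

    # Width and height of the minimum enclosing rectangle (MER).
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1

    # Rectangularity: R = A / (width x height).
    rectangularity = area / (width * height)

    # Perimeter (assumed reading): pixel edges that border the background.
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    perimeter = sum((x + dx, y + dy) not in filled
                    for x, y in pixels for dx, dy in neighbours)

    # Central moments m02 and m20 as discrete sums about the center of mass.
    x_bar = sum(xs) / area
    y_bar = sum(ys) / area
    m02 = sum((y - y_bar) ** 2 for y in ys)
    m20 = sum((x - x_bar) ** 2 for x in xs)

    return {
        "area": area,
        "perimeter": perimeter,
        "width": width,
        "height": height,
        "rectangularity": rectangularity,
        "m02": m02,
        "m20": m20,
    }

# A plus-shaped island: compact, but it does not fill its MER.
plus_mask = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
plus_features = morphological_features(plus_mask)
```

For the plus-shaped island the rectangularity is below 1 because the island does not fill its enclosing rectangle, whereas a solid rectangular island gives exactly 1.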