AJR:172, May 1999 1311
Effect of an Artificial Neural Network on Radiologists' Performance in the Differential Diagnosis of Interstitial Lung Disease Using Chest Radiographs

Kazuto Ashizawa 1,2
Heber MacMahon 1
Takayuki Ishida
Katsumi Nakamura 1
Carl J. Vyborny 1
Shigehiko Katsuragawa 1
Kunio Doi 1
OBJECTIVE. We developed a new method to distinguish between various interstitial lung diseases that uses an artificial neural network. This network is based on features extracted from chest radiographs and clinical parameters. The aim of our study was to evaluate the effect of the output from the artificial neural network on radiologists' diagnostic accuracy.

MATERIALS AND METHODS. The artificial neural network was designed to differentiate among 11 interstitial lung diseases using 10 clinical parameters and 16 radiologic findings. Thirty-three clinical cases (three cases for each lung disease) were selected. In the observer test, chest radiographs were viewed by eight radiologists (four attending physicians and four residents) with and without network output, which indicated the likelihood of each of the 11 possible diagnoses in each case. The radiologists' performance in distinguishing among the 11 interstitial lung diseases was evaluated by receiver operating characteristic (ROC) analysis with a continuous rating scale.

RESULTS. When chest radiographs were viewed in conjunction with network output, a statistically significant improvement in diagnostic accuracy was achieved (p < .0001). The average area under the ROC curve was .826 without network output and .911 with network output.

CONCLUSION. An artificial neural network can provide a useful "second opinion" to assist radiologists in the differential diagnosis of interstitial lung disease using chest radiographs.
Received September 9, 1998; accepted after revision November 16, 1998.

C. J. Vyborny, H. MacMahon, and K. Doi are shareholders of R2 Technology, Inc., Los Altos, CA.

Supported by United States Public Health Service grants CA24806 and CA62625. K. Ashizawa supported in part by a Japanese Board of Radiology grant and by the Konica Company, Tokyo, Japan.

1 Kurt Rossmann Laboratories for Radiologic Image Research, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637. Address correspondence to K. Doi.

2 Present address: Department of Radiology, Nagasaki University School of Medicine, Sakamoto 1-7-1, Nagasaki 852-8501, Japan.

AJR 1999;172:1311-1315

0361-803X/99/1725-1311
© American Roentgen Ray Society
Differential diagnosis of interstitial lung disease is a major subject in chest radiology. Although CT has greater diagnostic accuracy in the assessment of interstitial lung disease, chest radiography remains the imaging technique of choice for initial detection and diagnosis. However, differential diagnosis of interstitial lung disease using chest radiographs has always been difficult for radiologists because of the overlapping spectrum of radiographic appearances and the complexity of clinical parameters. Thus, one must often merge many radiologic features and clinical parameters to make a correct diagnosis.

Because of an ability to process large amounts of information simultaneously, artificial neural networks may be useful in the differential diagnosis of interstitial lung disease. In fact, artificial neural networks have been shown to be a powerful tool for pattern recognition and data classification in medical imaging [1-12]. In previous studies [1, 2], we applied an artificial neural network to the differential diagnosis of interstitial lung disease and showed the network to perform well. However, we have not compared the performance of the artificial neural network with radiologists' performance without and with network output. In this study, we used receiver operating characteristic (ROC) analysis to test the effect of network output on radiologists' performance in differentiating between certain interstitial lung diseases using chest radiographs.
Materials and Methods

Artificial Neural Network Scheme

The artificial neural network scheme and its performance for the differential diagnosis of interstitial lung disease have been described in detail [2]. A single three-layer, feed-forward artificial neural network with a back-propagation algorithm was used in this study. We designed the artificial neural network to distinguish among 11 types of interstitial lung disease on the basis of a given set of 26 clinical parameters and radiologic findings. The artificial neural network consisted of 26 input units for 10 clinical parameters and 16 radiologic findings, 11 output units corresponding to the 11 types of interstitial lung disease, and 18 hidden units. The 10 clinical parameters included the patient's age, sex, duration of symptoms, severity of symptoms, temperature, immune status, underlying malignancy, history of smoking, dust exposure, and drug treatment. The 16 radiologic findings (Table 1) were classified into three categories, namely, distribution of infiltrates (upper, middle, and lower zones of the right and left lungs; proximal or peripheral predominance), characteristics of infiltrates (homogeneity; fineness or coarseness; nodularity; degree of septal lines, honeycombing, and loss of lung volume), and three additional thoracic abnormalities (lymphadenopathy, pleural effusions, heart size). The interstitial lung diseases selected for differential diagnosis included sarcoidosis, miliary tuberculosis, lymphangitis carcinomatosa, interstitial lung edema, silicosis, Pneumocystis carinii pneumonia, scleroderma, eosinophilic granuloma, idiopathic pulmonary fibrosis, viral pneumonia, and pulmonary drug toxicity.

TABLE 1: One Radiologist's Ratings(a)

Radiologic Finding           | Case 1 | Case 2
Infiltrate distribution      |        |
  Right upper field          |   4    |   5
  Right middle field         |   6    |   4
  Right lower field          |   8    |   3
  Left upper field           |   4    |   5
  Left middle field          |   6    |   4
  Left lower field           |   8    |   3
  Proximal/peripheral        |   8    |   5
Infiltrate characteristics   |        |
  Homogeneous                |   7    |   8
  Fine/coarse                |   4    |   8
  Nodular                    |   3    |   9
  Septal lines               |   2    |   2
  Honeycombing               |   8    |   0
  Loss of lung volume        |   2    |   0
Lymphadenopathy              |   0    |   7
Pleural effusion             |   0    |   0
Heart size                   |   1    |   1

Note-The ratings ranged from 0 to 10, with the exception of heart size, which ranged from 1 to 5. For proximal/peripheral, 10 = peripheral; for fine/coarse, 10 = coarse.
(a) For 16 radiologic findings in a 64-year-old man with idiopathic pulmonary fibrosis (case 1) and a 34-year-old woman with sarcoidosis (case 2).
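The three-layer, feed-forward architecture described above (26 input units, 18 hidden units, 11 output units) can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' original implementation: the random weights stand in for values that back-propagation would learn, and the input vector is invented.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Layer sizes from the paper: 26 inputs (10 clinical parameters +
# 16 radiologic findings), 18 hidden units, and 11 output units
# (one per interstitial lung disease).
n_in, n_hidden, n_out = 26, 18, 11

# Hypothetical random weights, standing in for trained values.
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(x):
    """Forward pass: each output is a likelihood between 0 and 1."""
    h = sigmoid(W1 @ x + b1)
    return sigmoid(W2 @ h + b2)

# Input features are normalized to the range 0-1, as in the paper.
x = rng.uniform(0.0, 1.0, size=n_in)
y = forward(x)
print(y.shape)   # (11,)
```

Because the output layer uses independent sigmoid units, the 11 values are per-disease likelihoods rather than a probability distribution summing to 1.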
Our databases for the artificial neural network included 150 actual clinical cases, 110 published cases, and 110 hypothetical cases. Diagnoses of actual clinical cases were based on a detailed clinical correlation (n = 55 [37%]) or on pathologic (n = 61 [40%]) or bacteriologic (n = 34 [23%]) proof of the pulmonary lesion. For clinical cases and published cases, subjective ratings for the 16 radiologic findings were provided independently by three experienced radiologists. Table 1 shows examples of one radiologist's ratings for two clinical cases (Fig. 1). Input data obtained from clinical parameters and subjective ratings for radiologic findings were normalized to a range from 0 to 1. We used a modified round-robin (leave-one-out) method [8] to evaluate the performance of the artificial neural network in distinguishing the actual clinical cases. With this method, although a round-robin method was applied to all databases for training, only clinical cases were used for testing. Output values ranging from 0 to 1 indicated the likelihood of each of the 11 possible diseases in each case.
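The modified round-robin (leave-one-out) protocol can be sketched as below. The data and the 1-nearest-neighbor stand-in classifier are hypothetical; only the evaluation loop mirrors the paper, in which every database contributes to training but only clinical cases are ever held out and tested.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in data: rows are cases, columns the 26
# normalized input values; labels index the 11 diseases.
clinical_X = rng.uniform(0, 1, size=(150, 26))
clinical_y = rng.integers(0, 11, size=150)
other_X = rng.uniform(0, 1, size=(220, 26))   # published + hypothetical cases
other_y = rng.integers(0, 11, size=220)

def train_and_predict(train_X, train_y, test_x):
    """Placeholder 1-nearest-neighbor classifier standing in for the
    back-propagation network; the paper trained a neural network here."""
    dists = np.linalg.norm(train_X - test_x, axis=1)
    return train_y[np.argmin(dists)]

# Modified round-robin: train on all databases minus the held-out
# clinical case; test only on clinical cases.
correct = 0
for i in range(len(clinical_X)):
    train_X = np.vstack([np.delete(clinical_X, i, axis=0), other_X])
    train_y = np.concatenate([np.delete(clinical_y, i), other_y])
    pred = train_and_predict(train_X, train_y, clinical_X[i])
    correct += int(pred == clinical_y[i])

accuracy = correct / len(clinical_X)
```

With random labels the accuracy here is near chance; the point is the held-out-case bookkeeping, not the classifier.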
The performance of the artificial neural network was evaluated using ROC analysis [13]. Binormal ROC curves for the diagnosis of each disease were estimated by use of the LABROC4 algorithm developed in our laboratories [14]. Az values representing the area under each of the 11 ROC curves were calculated. The average performance was estimated by averaging the two binormal parameters of the 11 individual ROC curves [15]. The average Az value as a measure of overall performance was .947. The performance of the artificial neural network was also assessed by comparing output values indicating the likelihood of each of the 11 diseases in each case. By the two largest outputs, both the sensitivity and the specificity of the artificial neural network for indicating the correct diagnosis were 89%.
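The paper fits binormal ROC curves with the LABROC4 algorithm; as a rough stand-in, the area under an empirical ROC curve can be computed directly from the rank order of "actual positive" and "actual negative" output values. The score lists below are invented for illustration.

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Area under the empirical ROC curve via the rank-sum identity:
    AUC = P(positive score > negative score), ties counted as 1/2."""
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Illustrative network outputs: "actual positives" are outputs for the
# correct diagnosis, "actual negatives" outputs for the other diseases.
auc = empirical_auc([0.9, 0.8, 0.7, 0.6], [0.5, 0.4, 0.3, 0.65])
# 15 of the 16 positive/negative pairs are correctly ordered: AUC = 0.9375
```

Unlike this nonparametric estimate, the binormal fit used in the paper smooths the curve under a Gaussian latent-variable assumption, which matters for the small per-disease case counts.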
Case Selection

In the observer test, 33 actual clinical cases (three cases per disease) were selected from a database of 150 cases [2] by two experienced radiologists who did not participate in the observer test. The 33 clinical cases included 22 men and 11 women who were 21-84 years old (mean, 51 years). In each of the cases, the disease had only one cause.

Although each case initially had three output values based on the three input values provided by three radiologists, we averaged the output values for each case and presented these averages to the observers for the observer test. ROC analysis of the average output values for the artificial neural network found an Az value of .977 for the 33 cases. By the two largest outputs, the sensitivity and specificity of the artificial neural network for indicating the correct diagnosis were 91% and 89%, respectively. This level of performance of the artificial neural network, obtained with a subset of our database, was similar to that obtained with all 150 clinical cases.
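The "two largest outputs" criterion can be checked per case as follows. The output vector is invented for illustration, and only the sensitivity side of the criterion (is the correct diagnosis among the top two outputs?) is shown, since the paper does not spell out how the corresponding specificity is tallied.

```python
import numpy as np

def correct_in_top2(outputs, true_index):
    """True if the correct diagnosis has one of the two largest
    network outputs for this case."""
    top2 = np.argsort(outputs)[-2:]
    return bool(true_index in top2)

# Illustrative 11-element output vector; disease 3 is the true diagnosis
# and carries the largest output (0.85).
outputs = np.array([0.05, 0.10, 0.20, 0.85, 0.60,
                    0.02, 0.15, 0.08, 0.30, 0.12, 0.04])
hit = correct_in_top2(outputs, 3)   # True: 0.85 is the largest output
```

The fraction of cases for which this returns True, over all test cases, is the top-2 sensitivity reported in the paper.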
Observer Test

An ROC observer test can be either of two types: independent or sequential [16]. Ours was sequential. Each chest radiograph and its clinical parameters were shown to an observer, who rated the likelihood of each of the 11 diseases (without network output). Subsequently, the network output (Fig. 2) was presented to the same observer, who rated the likelihood a second time (with network output). The observer could either change the initial ratings or leave them unchanged.

Eight radiologists (four experienced radiologists [attending physicians] and four radiology residents) who knew nothing about the cases participated as observers. Before the test, the observers were told that only one of the 11 possible diseases was the correct diagnosis for each case; that the reading condition was based on the sequential test; that the role of the network output was to provide a "second opinion"; and that, by the two largest outputs, the sensitivity and specificity of the artificial neural network for indicating the correct diagnosis were 91% and 89%, respectively. Before the test, four training cases were shown to the observers to familiarize them with the rating method and with use of network output as a second opinion.

Fig. 1.-Examples of chest radiographs used in this study. A, 64-year-old man with idiopathic pulmonary fibrosis (case 1 in Table 1). B, 34-year-old woman with sarcoidosis (case 2 in Table 1).

TABLE 2: Az Values for ROC Curves for Diagnostic Accuracy With and Without ANN Output

Fig. 4.-Average receiver operating characteristic curves for attending physicians without and with artificial neural network output. Observer performance with network output was significantly improved. ANN = artificial neural network.
The observers' confidence about the likelihood of each of the 11 possible diseases was represented using an analog continuous-rating scale with a line-checking method [16]. For the initial ratings, the observers used a graphite pencil to mark their confidence levels along a 7-cm-long line. Ratings of "definitely absent" and "definitely present" were marked above the left and the right ends, respectively, of the line. For second ratings that were different from the initial ratings, observers used a red pencil to mark their confidence levels along the same line. For data analysis, the confidence level was scored by measuring the distance from the left end of the line to the marked point and converting the measurement to a scale from 0 to 100.

Fig. 2.-Graph shows one example of artificial neural network output presented to eight observers. Output values were obtained for 64-year-old man in Figure 1A. Largest output value among 11 diseases corresponds to correct diagnosis. tbc = tuberculosis, ca. = carcinomatosa, PCP = Pneumocystis carinii pneumonia, EG = eosinophilic granuloma, IPF = idiopathic pulmonary fibrosis, ANN = artificial neural network.

Fig. 3.-Comparison of average receiver operating characteristic curves for all observers without and with artificial neural network output and receiver operating characteristic curve for artificial neural network output alone. Observer performance with network output was significantly higher than that without network output but was still lower than performance of network output alone. ANN = artificial neural network.
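The conversion from a pencil mark on the 7-cm line to a 0-100 confidence score is a simple linear rescaling, which can be written as:

```python
def line_mark_to_score(distance_cm, line_length_cm=7.0):
    """Convert the measured distance from the left end of the rating
    line ("definitely absent") to a confidence score on a 0-100 scale."""
    if not 0.0 <= distance_cm <= line_length_cm:
        raise ValueError("mark must lie on the line")
    return 100.0 * distance_cm / line_length_cm

score = line_mark_to_score(3.5)   # midpoint of the line -> 50.0
```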
Data Analysis

The radiologists' diagnostic performance with and without network output was evaluated using ROC analysis [13]. We defined confidence ratings data with the correct diagnosis as "actual positives" and those with any other diseases as "actual negatives." For each observer and each reading condition (with and without network output), we used a maximum likelihood estimation to fit a binormal ROC curve to the confidence ratings data for all 11 possible diseases in the 33 cases [14]. This combining of data for all diseases was done because of the small number of cases of each disease. The Az value was then calculated for each fitted curve. The statistical significance of differences between ROC curves for each reading condition was determined by applying a two-tailed t test for paired data to the reader-specific Az values. We also analyzed the statistical significance of differences between ROC curves for attending physicians and those for radiology residents using a two-tailed t test. To represent the overall performance for each group of observers, average ROC curves were generated for the four residents, the four attending physicians, and all radiologists by averaging the two binormal parameters of their individual ROC curves [15].

Another indication of performance was the number of correctly diagnosed cases for which the observer's ranking changed because of network output. Four rankings were used (1, 2, 3, and less than 3), with 1 corresponding to a case that the observer diagnosed correctly with the highest confidence rating, 2 corresponding to a case diagnosed correctly with the second highest confidence rating, and so on. An improvement in a ranking, such as a change from 2 to 1, indicated that network output benefited diagnostic performance; the opposite indicated a detrimental effect. Using a two-tailed t test for paired data, we analyzed the statistical significance of the difference between the number of cases benefited and the number not benefited. The same test was used to analyze differences between the number of attending physicians' cases affected and the number of residents' cases affected.
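The two-tailed t test for paired data applied to the reader-specific Az values can be sketched as follows. The Az values below are hypothetical, and only the t statistic is computed; the p-value would then come from a t distribution with n - 1 degrees of freedom.

```python
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """t statistic for a paired t test on per-reader values:
    mean of the differences over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5)

# Hypothetical Az values for 8 readers, with and without network output.
az_with    = [0.92, 0.90, 0.91, 0.89, 0.93, 0.90, 0.94, 0.92]
az_without = [0.84, 0.81, 0.83, 0.80, 0.85, 0.82, 0.83, 0.83]
t = paired_t_statistic(az_with, az_without)
```

Pairing by reader is what gives the test its power here: each radiologist serves as his or her own control across the two reading conditions.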
Results

The overall performance by the three groups of observers is illustrated by the average ROC curves in Figures 3-5. Diagnostic performance improved when chest radiographs and clinical parameters were shown in conjunction with network output. The average Az value for all radiologists increased to a statistically significant degree, from .826 without network output to .911 with network output (p < .0001). However, for all radiologists, the average Az value with network output was still lower than for the network alone (Az = .977). For the four attending physicians, the average Az values without and with network output were .839 and .905, respectively, whereas for the four residents, the average Az values without and with network output were .812 and .916, respectively. These differences were also statistically significant (p = .0026 and p = .0074, respectively). Table 2 shows the Az values without and with network output for each radiologist. The performance of all improved when network output was used.

TABLE 3: Number of Cases (of 33 Total) for Which Ranking Was Affected by Artificial Neural Network Output
When we compared the average performance of the four attending physicians with that of the four residents, the difference in Az values failed to reach statistical significance for each of the two reading conditions (without [p = .360] and with [p = .565] network output). Although the gain in performance with use of network output was larger for residents than for attending physicians, the difference was not statistically significant (p = .077).

Table 3 shows the number of cases affected by network output for the individual radiologists. The net effect of network output was clearly an improvement in performance. The average number of cases affected beneficially and detrimentally by network output for all radiologists was 9.8 and 1.4, respectively, and this difference was statistically significant (p = .0002). When the network ranking was 1, all cases affected by network output showed a beneficial effect for all radiologists (Fig. 6). The difference between the average number of cases affected beneficially and detrimentally by network output was also statistically significant (p = .0034 and p = .0002, respectively) for the attending-physician group (6.6 and 1.0, respectively) and the resident group (13.0 and 1.8, respectively).

In a comparison of the average number of cases benefited by network output for the four attending physicians and for the four residents, the difference was statistically significant (p = .0001), whereas the difference in the detrimental effect was not statistically significant (p = .414) (Fig. 6). Performance improved significantly more for the residents than for the attending physicians.
Fig. 5.-Average receiver operating characteristic curves for radiology residents without and with artificial neural network output. Observer performance with network output was significantly improved. ANN = artificial neural network.

Fig. 6.-Graph shows number of correctly diagnosed cases for which observer's ranking (1 or 2) changed because of network output. Network output clearly improved performance of all observers. ANN = artificial neural network, white bars = ranking of 1, gray bars = ranking of 2.
Discussion

The results of our observer test indicate that network output can significantly improve radiologists' performance in the differential diagnosis of interstitial lung disease using chest radiographs. Three radiologists (one attending physician and two residents) participated in a pilot observer test before this study. Another set of 33 clinical cases, which do not overlap the cases used in this study, was selected from our database and used for the pilot test. The average performance of the three radiologists was significantly greater with network output (Az = .930) than without it (Az = .867) (p < .05). The pilot results support those of the main study.

The performance of the artificial neural network (Az = .977) was substantially better than the radiologists' performance (Az = .826), and the difference was statistically significant (p < .0001). This finding can be interpreted as follows. The differential diagnosis of interstitial lung disease using chest radiographs is generally considered to require extraction of image features and subsequent merging of extracted features and clinical parameters. However, the features used by one radiologist may differ from those used by another and may vary from case to case. The approach tends to be less than systematic and influenced by anecdotal experience. Therefore, the 16 image features used by the network would likely not be identical to those used by the radiologists but, probably, would be more comprehensive. Thus, the network's information would consistently be more complete than the radiologists'. In addition, the network would likely be better able than the radiologists to combine features. To verify this assumption rigorously, however, one would need to compare the ability of the network and the radiologists to merge the same features. In practice, performance differences between the network and the radiologists may have resulted from differences in both the extraction and the merging.
Studies on the detection of abnormalities such
as lung nodules on chest radiographs and micro-
calcifications on mammograms have shown that
the computer helped radiologists detect findings
even if its performance was lower than that of the
radiologists [16, 17]. This result was probably
possible because the computer output alerted ra-
diologists to the locations of some lesions that
might be missed by radiologists. When the
computer scheme indicates whether a lesion
exists by using an arrow in a detection task, it
is relatively easy for radiologists to decide ei-
ther to agree with or to disregard the com-
puter output. In other words, radiologists
would be able to correct their mistakes if they
recognize that, for example, they might have
overlooked an obvious lesion. Our study, on
the other hand, found that the performance of
radiologists with network output was lower
than the performance of the artificial neural
network alone. Because the performance of
the network was much higher than that of the
radiologists alone, one might expect that the
performance of radiologists with network output would be at least equal to that of the
network. However, the radiologists could not
take full advantage of network output be-
cause they were not familiar with using it.
Network output indicates the likelihood of each
of the 11 possible diseases, information that
radiologists may find difficult to assimilate. In
addition, although the radiologists knew that the
network performed well, they might not have
had confidence in all of its output, some of
which may have disagreed with their own
knowledge and experience. To gain confidence in network output and familiarity with
using it, radiologists need to apply it prospec-
tively to differential diagnosis in actual clini-
cal situations.
The performance of one of the attending physicians without and with network output was lower than that of the other attending physicians, possibly because for almost all the cases he did not use the clinical parameters when interpreting chest radiographs. Therefore, even for experienced observers, consideration of these parameters would seem important. His low performance probably led to the comparability between the average Az values of the attending-physician group and those of the resident group. However, the improvement in his performance with network output was similar to that of the other attending physicians, so the comparison of gains for the two groups might be meaningful. In fact, the performance gain for residents, as measured by both Az value and number of cases benefited, was larger than that for attending physicians. This finding indicates that network output can improve the performance of radiologists, especially those less experienced. We believe further study is needed to determine the true difference in performance between the two groups of observers under the two reading conditions.
Independent ROC observer tests, which
have been used widely, have two sessions. In
the first, each observer interprets half the cases
with network output and the other half with-
out. In the second, the first half is interpreted
without and the second half with network out-
put. The general assumption is that observers
do not remember the cases interpreted in the
first session. This assumption is correct for
most simple tests, such as the detection of lung
nodules, but questionable for our study because the task required was differential diagnosis,
which is more complicated than detection. Thus,
radiologists would likely remember details of
some cases, especially first-session cases with
network output.
Unlike independent tests, a sequential test,
which we used, measures in one session the effect of network output on diagnostic decisions.
Concern about variations in the observer’s
memory is thus eliminated. The sequential test
is not, however, an established method for
ROC observer studies and is inherently biased
because the observer is always first tested
without network output. However, one study
[16] indicated that these two types of observer
tests reach similar conclusions. Therefore, we
believe that the sequential test can be used to
evaluate the effect of computer output, espe-
cially for differential diagnosis.
In conclusion, our results indicate that artifi-
cial neural network output can significantly
improve the accuracy of radiologists in the dif-
ferential diagnosis of interstitial lung disease
using chest radiographs. We believe that net-
work output, when used as a second opinion,
can help radiologists with decision making.
Acknowledgments
We thank Hajime Nakata (University of Oc-
cupational and Environmental Health, School
of Medicine, Fukuoka, Japan) and Kuniaki Ha-
yashi (Nagasaki University School of Medicine,
Nagasaki, Japan) for supplying valuable clinical
cases. We also thank John J. Fennessy, Lau-
rence Monnier, Walter Cannon, Thomas Woo,
Scott Stacy, Bruce Lin, Dixson Gilbert, and
Shawn Kenney for participating as observers;
Charles E. Metz for useful suggestions and dis-
cussions about receiver operating characteristic
analysis; Hiroyuki Yoshida and Yulei Jiang for
helpful discussions; and E. Lanzl for improving
the manuscript.
References

1. Asada N, Doi K, MacMahon H, et al. Potential usefulness of an artificial neural network for differential diagnosis of interstitial lung disease: pilot study. Radiology 1990;177:857-860
2. Ashizawa K, Ishida T, MacMahon H, Vyborny CJ, Katsuragawa S, Doi K. Artificial neural networks in chest radiography: application to the differential diagnosis of interstitial lung disease. Acad Radiol 1999;6:2-9
3. Ishida T, Katsuragawa S, Ashizawa K, MacMahon H, Doi K. Artificial neural networks in chest radiographs: detection and characterization of interstitial lung disease. Proc SPIE 1997;3034:931-937
4. Gross GW, Boone JM, Greco-Hunt V, Greenberg B. Neural networks in radiologic diagnosis. II. Interpretation of neonatal chest radiographs. Invest Radiol 1990;25:1017-1023
5. Lo SC, Freedman MT, Lin JS, Mun SK. Automatic lung nodule detection using profile matching and back-propagation neural network techniques. J Digit Imaging 1993;6:48-54
6. Gurney JW, Swensen SJ. Solitary pulmonary nodules: determining the likelihood of malignancy with neural network analysis. Radiology 1995;196:823-829
7. Wu Y, Doi K, Giger ML, Nishikawa RM. Computerized detection of clustered microcalcifications in digital mammograms: application of artificial neural networks. Med Phys 1992;19:555-560
8. Wu Y, Giger ML, Doi K, Vyborny CJ, Schmidt RA, Metz CE. Artificial neural networks in mammography: application to decision making in the diagnosis of breast cancer. Radiology 1993;187:81-87
9. Heitmann KR, Kauczor H, Mildenberger P, et al. Automatic detection of ground glass opacities on lung HRCT using multiple neural networks. Eur Radiol 1997;7:1463-1472
10. Henschke CI, Yankelevitz DF, Mateescu I, et al. Neural networks for the analysis of small pulmonary nodules. Clin Imaging 1997;21:390-399
11. Bocchi L, Coppini G, De Dominicis R, et al. Tissue characterization from X-ray images. Med Eng Phys 1997;19:336-342
12. Lin JS, Hasegawa A, Freedman MT, et al. Differentiation between nodules and end-on vessels using a convolution neural network architecture. J Digit Imaging 1995;8:132-141
13. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986;21:720-733
14. Metz CE, Herman BA, Shen JH. Maximum-likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data. Stat Med 1998;17:1033-1053
15. Metz CE. Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 1989;24:234-245
16. Kobayashi T, Xu XW, MacMahon H, Metz CE, Doi K. Effect of a computer-aided diagnosis scheme on radiologists' performance in detection of lung nodules on radiographs. Radiology 1996;199:843-848
17. Chan H-P, Doi K, Vyborny CJ, et al. Improvement in radiologists' detection of clustered microcalcifications on mammograms: the potential of computer-aided diagnosis. Invest Radiol 1990;25:1102-1110