

ORIGINAL ARTICLE: Clinical Endoscopy

Artificial neural networks accurately predict mortality in patients with nonvariceal upper GI bleeding

Gianluca Rotondano, MD, FASGE, Livio Cipolletta, MD, Enzo Grossi, MD, Maurizio Koch, MD, Marco Intraligi, MD, Massimo Buscema, PhD, Riccardo Marmo, MD, for the Italian Registry on Upper Gastrointestinal Bleeding (Progetto Nazionale Emorragie Digestive)

Rome, Italy

Background: Risk stratification systems that accurately identify patients with a high risk for bleeding through the use of clinical predictors of mortality before endoscopic examination are needed. Computerized (artificial) neural networks (ANNs) are adaptive tools that may improve prognostication.

Objective: To assess the capability of an ANN to predict mortality in patients with nonvariceal upper GI bleeding and compare the predictive performance of the ANN with that of the Rockall score.

Design: Prospective, multicenter study.

Setting: Academic and community hospitals.

Patients: This study involved 2380 patients with nonvariceal upper GI bleeding.

Intervention: Upper GI endoscopy.

Main Outcome Measurements: The primary outcome variable was 30-day mortality, defined as any death occurring within 30 days of the index bleeding episode. Other outcome variables were recurrent bleeding and need for surgery.

Results: We performed analysis of certified outcomes of 2380 patients with nonvariceal upper GI bleeding. The Rockall score was compared with a supervised ANN (TWIST system, Semeion), adopting the same result validation protocol with random allocation of the sample in training and testing subsets and subsequent crossover. Overall, death occurred in 112 cases (4.70%). Of 68 pre-endoscopic input variables, 17 were selected and used by the ANN versus 16 included in the Rockall score. The sensitivity of the ANN-based model was 83.8% (76.7-90.8) versus 71.4% (62.8-80.0) for the Rockall score. Specificity was 97.5% (96.8-98.2) and 52.0% (49.8-54.2), respectively. Accuracy was 96.8% (96.0-97.5) versus 52.9% (50.8-55.0) (P < .001). The predictive performance of the ANN-based model for prediction of mortality was significantly superior to that of the complete Rockall score (area under the curve 0.95 [0.92-0.98] vs 0.67 [0.65-0.69]; P < .001).

Limitations: External validation on a subsequent independent population is needed; patients with variceal bleeding and obscure GI hemorrhage were excluded.

Conclusion: In patients with nonvariceal upper GI bleeding, ANNs are significantly superior to the Rockall score in predicting the risk of death. (Gastrointest Endosc 2011;73:218-26.)

Abbreviations: ANN, artificial neural network; CI, confidence interval; IS, input selection; PNED, Progetto Nazionale Emorragie Digestive; T&T, training and testing; UGIH, upper GI hemorrhage.

DISCLOSURE: All authors disclosed no financial relationships relevant to this publication.

Copyright © 2011 by the American Society for Gastrointestinal Endoscopy
0016-5107/$36.00
doi:10.1016/j.gie.2010.10.006

Received July 30, 2010. Accepted October 5, 2010.

Current affiliations: Division of Gastroenterology (G.R., L.C.), Ospedale Maresca, Torre del Greco, Division of Gastroenterology (R.M.), Ospedale Curto, Polla, Division of Gastroenterology (M.K.), Azienda Ospedaliera San Filippo Neri, Rome, Medical Department (E.G.), Bracco S.p.A., Milan, Semeion Research Centre for Sciences of Communication (M.I., M.B.), Rome, Italy.

Poster presented in part at Digestive Disease Week, May 30-June 4, 2009, Chicago, Illinois (Gastrointest Endosc 2009;69:AB308).

Reprint requests: Gianluca Rotondano, MD, FASGE, Via Cappella Vecchia 8, 80121 Naples, Italy.

If you would like to chat with an author of this article, you may contact Dr Rotondano at [email protected].

218 GASTROINTESTINAL ENDOSCOPY Volume 73, No. 2 : 2011 www.giejournal.org


Rotondano et al Neural networks predict mortality in nonvariceal bleeding

Acute upper GI hemorrhage (UGIH) remains a common reason for hospital admission.1-3 Early risk stratification of patients with bleeding by using clinical and endoscopic criteria is recommended. Those at high risk for recurrent bleeding and death, who need aggressive medical and endoscopic treatment, must be distinguished from low-risk patients who can be discharged safely from the hospital or managed as outpatients.4-8 Because it may not be practical to perform early endoscopy in every patient admitted for UGIH and because mortality is clinically relevant in the first 12 to 24 hours of the onset of hemorrhage,9 it is important to have a risk stratification system that accurately identifies high-risk patients before endoscopic examination through the use of clinical predictors of mortality.

Multiple death-predicting models have been developed with conventional statistical procedures,9-12 but their application at the individual level is hampered by the high interdependence of the clinical variables involved, which potentially may interact with each other with reciprocal enhancement.

Artificial neural networks (ANNs) are computerized systems that use nonlinear statistical analysis to reveal previously unrecognized and/or weak relationships between the input and output variables.13,14 These multilayered connectionist models are adaptive; that is, they dynamically find the connections between various datasets. ANNs also have the capability of generalization: when an ANN has been trained with proper data for determining the rules that better describe a certain phenomenon, it also can make some correct generalizations responding correctly to data it has not yet processed. In other words, after a learning period, the ANN can predict the output on inputs of further unknown cases. Published studies have shown that ANNs can accurately predict patient outcomes in acute lower and UGIH.15,16

The primary aim of this study was to validate the use of a nonendoscopic, ANN-based model in predicting mortality in patients with acute nonvariceal UGIH. The secondary aim was to compare the predictive ability of ANNs with that of the Rockall scoring system.10

METHODS

The study was approved by the institutional review boards of all participating centers. A project-specific research database was developed, which collected data from 23 participating sites across Italy, constituting the Progetto Nazionale Emorragie Digestive (PNED) study group, which ran 2 prospective studies in 2003 to 2004 (ref 9) and 2007 to 2008 (ref 11). In these studies, a traditional modeling method of univariate and multivariate analysis was conducted to identify predictors of mortality in patients with nonvariceal upper GI bleeding.

Patients were considered for the studies if they had clinical evidence of overt UGIH on admission (hematemesis or melena or dark, tarry materials on rectal examination documented and witnessed by nursing or medical staff); a history of hematemesis/coffee-ground vomiting, melena, hematochezia or a combination of any of these within 24 hours preceding the admission; or clinical evidence of acute UGIH while hospitalized for any other reason (in-hospital bleeding), independently of their age. Criteria of exclusion were age less than 18 years, primary diagnosis other than acute UGIH, chronic anemia, variceal bleeding, obscure bleeding, transfer from another institution, or UGIH that occurred more than 3 days before presentation. Upper GI endoscopy was systematically performed within 24 hours to confirm the source of bleeding in all cases. Endoscopic therapy of high-risk ulcer stigmata included injection of diluted epinephrine, thermal coaptive coagulation or argon plasma coagulation, application of hemoclips, or a combination thereof. The choice of the specific modality of endotherapy was left at the discretion of the operator according to the individual site's protocol.

The primary outcome variable was 30-day mortality, defined as any death occurring within 30 days of the index bleeding episode. Other outcome variables were recurrent bleeding and need for surgery. Such outcomes were monitored from admission to the hospital, or the onset of bleeding for in-hospital patients, up to 30 days after the endoscopic examination. To ensure the completeness of follow-up information, the study nurses called all patients or their families or the referring primary care physician at 30 days. Continued or persistent bleeding was defined as (1) failure to control arterial bleeding endoscopically; (2) the presence of a bloody nasogastric aspirate; (3) shock, with a pulse greater than 100 beats/minute, a systolic blood pressure <100 mm Hg, or both; and/or (4) the need for substantial replacement of blood or fluid volume (need of more than 4 units of blood in 6 hours). Persistent hemorrhage was usually considered an indication for emergency surgery or percutaneous embolization. Recurrent bleeding was defined by recurrent vomiting of fresh blood, melena, or both, with one of the following: development of hemodynamic instability, a decrease in hemoglobin concentration of at least 2 g/dL, or a decrease of 5% or more in hematocrit after initial successful treatment and after a period of clinical stabilization of 24 hours. Rebleeding had to be confirmed by a second endoscopy.

Take-home Message

● Neural networks are adaptive models that take into account nonlinear interactions among variables.

● In patients with nonvariceal upper GI bleeding, a nonendoscopic artificial neural network (ANN)-based model was accurate in predicting the risk of death. The model was significantly superior to the Rockall score.

● ANNs are promising tools for improving prognostication. Future studies will need to evaluate whether ANNs can be used as a decision aid to the clinician.

Preprocessing methods and experimental protocols (supervised ANN)

In the present study, the null hypothesis was that the ANN-based model had a discriminant capability similar to that of the Rockall scoring system10 in predicting death in patients with nonvariceal UGIH. ANN models were built by using a commercial neural network program (TWIST, Semeion Software, Rome, Italy). ANN models were tested on the same dataset used for logistic regression analysis in earlier publications, using only the pre-endoscopic clinical measures of the PNED study9,11 as input variables and a single output variable (30-day mortality).

ANNs were trained on an initial subset of patients with a 50% a priori probability of the output variable (mortality), artificially inserting an equal number of patients deceased and survived, in order for the system to "learn" to recognize the variables that better separate cases (death) from controls (survivors). For a complete description of the data selection process, see the Appendix (available online at www.giejournal.org).
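The balanced learning subset described above can be illustrated with a short sketch. This is not the TWIST implementation: it assumes that balance is obtained by randomly undersampling survivors, and the function name `balanced_training_subset`, the `died` field, and the toy cohort are hypothetical. Only the outcome counts (112 deaths among 2380 patients) come from the study.

```python
import random

def balanced_training_subset(records, label_key="died", seed=0):
    """Return a shuffled subset with equal numbers of cases and controls.

    Assumption: controls (survivors) are randomly undersampled to match the
    number of cases (deaths), giving a 50% a priori outcome probability.
    """
    rng = random.Random(seed)
    cases = [r for r in records if r[label_key] == 1]
    controls = [r for r in records if r[label_key] == 0]
    n = min(len(cases), len(controls))
    subset = rng.sample(cases, n) + rng.sample(controls, n)
    rng.shuffle(subset)
    return subset

# Toy cohort with the study's outcome counts: 112 deaths in 2380 patients.
cohort = [{"id": i, "died": 1 if i < 112 else 0} for i in range(2380)]
subset = balanced_training_subset(cohort)
prior = sum(r["died"] for r in subset) / len(subset)
print(len(subset), prior)
```

Undersampling is only one way to reach a 50% prior; oversampling the deceased patients would serve the same purpose and may be closer to what "artificially inserting" means in the text.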

Data preprocessing was performed by using 2 different resampling criteria of the global dataset.

Random criterion. Once the learning phase on the whole dataset was performed, the study sample was randomly divided 5 times into 2 subsamples, always different but containing a similar distribution of cases and controls: the training one (containing the dependent variable) and the testing one (5 × 2 cross-validation protocol).17 During the training phase, the ANN learned a model of data distribution and then, on the basis of such a model, classified patients in the testing set in a blind way. Training and testing sets were then reversed, and consequently 10 analyses for every model used were conducted.

Optimized criterion: TWIST system. The TWIST system is an evolutionary algorithm able to optimize simultaneously both variables selection and training-testing distribution. It is an ensemble of 2 algorithms: training and testing (T&T) and input selection.17 The T&T system is a robust data resampling technique able to arrange the source sample into subsamples that all have a similar probability density function. In this way, the data are split into 2 or more subsamples so that the ANN models can be trained, tested, and validated more effectively. The T&T is based on a population of n ANNs managed by an evolutionary system. In its simplest form, this algorithm reproduces several distribution models of the complete dataset in 2 subsets (A, the training set and B, the testing set). During the learning process, according to its own data distribution model, each ANN is trained in a blind and independent manner in 2 directions: training with subsample A and blind testing with subsample B versus training with subsample B and blind testing with subsample A (Fig. 1). The input selection system is an evolutionary wrapper system able to reduce the amount of data while conserving the largest amount of information available in the dataset. The combined action of these 2 systems allows us to solve 2 common problems in managing ANN, that is, overfitting or overlearning.18 Both evolutionary systems are based on the Genetic Doping Algorithm developed at Semeion Research Centre.17

Figure 1. Validation protocol for the supervised TWIST system (Semeion, Italy).

After this processing, the features that were most significant for the classification were selected and, at the same time, the training set and the testing set were created with a function of probability distribution similar to the one that provided the best results in the classification. A supervised multi-layer perceptron with back propagation, with 4 hidden units, was then used for the classification task.
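The random criterion (5 × 2 cross-validation with crossover) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `five_by_two_splits` is hypothetical, stratification is done by halving cases and controls separately, and the evolutionary T&T optimization is not reproduced.

```python
import random

def five_by_two_splits(labels, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a stratified 5 x 2 protocol.

    Each repeat halves the cases and the controls separately, so both halves
    keep a similar case/control mix, then yields the pair in both directions.
    """
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    for _ in range(repeats):
        rng.shuffle(cases)
        rng.shuffle(controls)
        half_a = cases[: len(cases) // 2] + controls[: len(controls) // 2]
        half_b = cases[len(cases) // 2 :] + controls[len(controls) // 2 :]
        yield half_a, half_b  # train on A, blind test on B
        yield half_b, half_a  # crossover: train on B, blind test on A

# Outcome vector with the study's counts: 112 deaths, 2268 survivors.
labels = [1] * 112 + [0] * 2268
splits = list(five_by_two_splits(labels))
print(len(splits))
```

Five random halvings, each used in both directions, give the 10 analyses per model mentioned in the text.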

Data analysis

We calculated accuracy, sensitivity (correct identification of cases, ie, patients deceased), specificity (correct identification of controls, ie, survivors), positive and negative predictive values, and likelihood ratios for positive and negative tests for both the ANN and the Rockall score.10 The χ2 test was used to compare categorical variables. Furthermore, areas under the receiver operating characteristic curves for the ANN and the complete Rockall score (clinical and endoscopic) were calculated for the outcome variable.19,20
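For readers unfamiliar with the area under the receiver operating characteristic curve, the sketch below computes it through its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control. The text does not state which estimator was used in the study, so this is one standard formulation rather than the authors' procedure; `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case outranks a random control.

    Ties count one half, which matches the trapezoidal area under the curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and outcomes (1 = died), not study data.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(roc_auc(scores, labels))  # 12 of 16 case-control pairs are ordered correctly: 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.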

RESULTS

Certified outcomes of 2380 patients with nonvariceal UGIH (1543 men [64.8%], mean age ± SD 68.0 ± 16.3 years) were analyzed. Demographic and clinical features of patients are shown in Table 1.

A recent history of medication use was recorded in a high proportion of cases, mostly nonsteroidal anti-inflammatory drugs and low-dose aspirin. One or more comorbidities were recorded at the time of presentation in over 60% of cases, with 25% having more than 2 comorbidities. The median number of comorbid conditions per patient was 1.0 (interquartile range 1.0-2.0). High-risk endoscopic signs were present in almost 40% of these patients, whereas endoscopic therapy was delivered in 42% of cases. Mean (± SD) length of stay was 7.2 (± 6.3) days (median: 4.0, interquartile range 2.0-9.0). Recurrent bleeding occurred in 123 instances (5.16%; range 3.93%-6.88%), need for surgery was recorded in 46 cases (1.93%; range 0.91%-2.76%), and a total of 112 patients (4.70%; range 3.54%-5.75%) died during hospitalization or at 30 days. The median time to death was 4 days (range 2-6 days). The mean (± SD) age of the patients who died was 76.6 (± 14.0) years.

Testing for Helicobacter pylori was performed during the hospital stay in 42% of patients (all by means of histologic evaluation, as per protocol). H pylori infection was detected in 44.3% (95% confidence interval [CI] 41.3-47.4) of tested patients.

Classification performances of ANN and Rockall score10

Table 2 summarizes the input variables used for modeling. The advanced ANN model, through the final selection of a subgroup of 17 variables, which were the most often selected among 11 independent applications (at least 10 times), provided the highest predictive performance, with a sensitivity of 83.8% (95% CI, 76.7-90.8), a specificity of 97.5% (95% CI, 96.8-98.2), and an overall accuracy of 96.8% (95% CI, 96.0-97.5) (Table 3). The resulting receiver operating characteristic curve is reported in Figure 2, with a comparison with the receiver operating characteristic curve obtained with the Rockall score (area under the curve 0.95 [0.92-0.98] vs 0.67 [0.65-0.69]; P < .001).
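The performance measures reported for the two instruments are linked by standard identities: given sensitivity, specificity, and the prevalence of the outcome, the accuracy, predictive values, and likelihood ratios follow. The sketch below applies these identities to the reported sensitivity and specificity with the study's 30-day mortality (112/2380) as prevalence. The function name `derived_metrics` is hypothetical, and the derived values are a plausibility check only; small differences from Table 3 are to be expected, since the table reports mean yields of the best predictive models.

```python
def derived_metrics(sensitivity, specificity, prevalence):
    """Derive accuracy, PPV, NPV and likelihood ratios from three inputs."""
    tp = sensitivity * prevalence              # true positives, as a cohort fraction
    fn = (1 - sensitivity) * prevalence        # missed deaths
    tn = specificity * (1 - prevalence)        # correctly identified survivors
    fp = (1 - specificity) * (1 - prevalence)  # survivors flagged as high risk
    return {
        "accuracy": tp + tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }

prevalence = 112 / 2380  # 30-day mortality reported in the study
ann = derived_metrics(0.838, 0.975, prevalence)
rockall = derived_metrics(0.714, 0.520, prevalence)
for name, metrics in (("ANN", ann), ("Rockall", rockall)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With a prevalence below 5%, the positive predictive value is dominated by specificity, which is why the Rockall score's 52% specificity translates into a positive predictive value of only about 7%.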

The following variables were selected by the TWIST system: age; use of calcium-antagonists or heparin; American Society of Anesthesiologists Physical Status Classification System class; the presence of comorbid diseases (congestive heart failure, cerebrovascular disease, chronic obstructive pulmonary disease, diabetes, chronic renal insufficiency, liver cirrhosis, and cancer); the presence of shock, hypotension, or tachycardia; a clinical presentation with bright red blood hematemesis; initial laboratory test values such as lowest hemoglobin level; and time elapsed from symptoms to hospital admission.

DISCUSSION

Risk assessment is recommended for early stratification of patients with upper GI bleeding into low-risk and high-risk categories.4,7,21,22 Death-prediction tools such as the Rockall score10 or PNED score11 have been developed with conventional statistical procedures. These models use a limited number of variables because traditional statistical approaches tend to select only variables that have a certain level of correlation with the outcome variable (top-down computation, ie, a model is applied to data). There are 3 major pitfalls in this algorithm: (1) the inability to capture the diseases' interactions, (2) the inability to capture the process dynamics, and (3) the very large CI of individual risk assessment. Hence, classical statistical approaches have intrinsic limitations in handling this kind of nonlinear and complex information.

ANNs are adaptive models that use a dynamic approach to analyzing the risk of mortality and are able to modify the internal structure in relation to a functional objective (bottom-up computation, ie, the data themselves generate the model). Despite their inability to deal with missing data, ANNs are able to simultaneously handle a very high number of variables, building up models that take into account outliers and nonlinear interactions among variables. Therefore, although conventional statistics reveal parameters that are significant for the entire population only, ANNs include parameters that might not reach significance for the entire population but are highly significant at the individual level. Furthermore, unlike standard statistical tests, ANNs can manage complexity even with small samples and unbalanced ratio between variables and records. In this respect, ANNs overcome the problem of dimensionality. We believe that the large and homogeneous dataset used in the present study might have provided a robust basis for training the network, by using all clinical variables shown to have a potential impact on mortality in previous logistic regression models.9-11
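What "nonlinear interactions among variables" means can be shown with a textbook toy case that has nothing to do with the study data: an outcome that occurs when exactly one of two binary factors is present. No additive score with a single cutoff reproduces that pattern, whereas a network with two hidden units does. The weights below are set by hand for illustration, and all names (`tiny_network`, `linear_score`) are hypothetical.

```python
def step(x):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0

def tiny_network(a, b):
    """Two hidden threshold units feeding one output unit (hand-set weights)."""
    h_or = step(a + b - 0.5)         # fires if at least one factor is present
    h_and = step(a + b - 1.5)        # fires only if both factors are present
    return step(h_or - h_and - 0.5)  # fires if exactly one factor is present

def linear_score(a, b, w_a, w_b, threshold):
    """Additive score with a single cutoff, as in a points-based risk score."""
    return step(w_a * a + w_b * b - threshold)

# Toy interaction pattern: outcome present when exactly one factor is present.
target = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
network_ok = all(tiny_network(a, b) == y for (a, b), y in target.items())

# Search a grid of weights and cutoffs: no additive score fits all four rows.
grid = [x / 2 for x in range(-6, 7)]
linear_ok = any(
    all(linear_score(a, b, wa, wb, t) == y for (a, b), y in target.items())
    for wa in grid for wb in grid for t in grid
)
print(network_ok, linear_ok)  # True False
```

The grid search is only a demonstration; the impossibility holds for any weights, because the four conditions on the weights and cutoff contradict each other.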

In patients with acute nonvariceal UGIH, we have

hown that the nonendoscopic ANN model performed

olume 73, No. 2 : 2011 GASTROINTESTINAL ENDOSCOPY 221

sptiormahoRmwwsiIwpod(rm

Neural networks predict mortality in nonvariceal bleeding Rotondano et al

222 GASTROINTESTINAL ENDOSCOPY Volume 73, No. 2 : 2011

ignificantly better than the complete Rockall score10 inredicting 30-day mortality. To the best of our knowledge,his is the first prospective study using a supervised ANNn the prediction of death from nonvariceal UGIH. Thenly other similar study in the literature, recorded in 1998,eported the use of an ANN in predicting the major stig-ata of ulcer bleeding and the need for endoscopic ther-

py in a dataset of UGIH patients.16 These are surrogateigh-risk markers and do not fully encompass all adverseutcomes, including recurrent bleeding and death. Theockall score10 was originally developed to predict patientortality and not the outcomes used in that study. Thisould partially explain why, although ANNs performedell, the model was not completely accurate and had low

pecificity, a point of concern because none of the patientsncorrectly categorized had atypical clinical presentations.n our study, ANNs were used to predict mortality andere compared with a specifically designed death-rediction tool. The ANN showed significantly superiorverall accuracy compared with the Rockall score in pre-icting the risk of death from acute nonvariceal UGIH96.8% vs 52.9%). Other outcome parameters, such asebleeding or need for endoscopic or surgical therapy,

Table 1. Continued

FeaturesFrequency in the

population (N � 2380)

Gastroduodenal erosions 9.1% (7.9-11.1)

Neoplasia 5.4% (4.0-6.9)

Mallory-Weiss tears 4.5% (3.8-5.5)

Esophagitis 5.6% (3.0-8.1)

No lesion identified 5.9% (4.7-6.6)

SRH at ulcer base (N � 1530)

Low risk (clean base, flat spot) 60.9% (58.3-69.6)

High risk (active bleeding, visiblevessel, adherent clot)

39.1% (35.0-41.4)

Need for endoscopic therapy 41.0% (38.9-44.4)

Helicobacter pylori infection rate 44.3% (41.3-47.4)

Recurrent bleeding 5.16% (3.9-6.9)

Mortality 4.7% (3.5-5.7)

Rockall score,‡ mean (� SD) 4.52 � 2.1

SD, Standard deviation; CAD, coronary artery disease; CHF,congestive heart failure; ASA, American Society of Anesthesiologists;NSAID, nonsteroidal anti-inflammatory drugs; SRH, stigmata ofrecent hemorrhage.*Values are mean (95% confidence intervals or SD).†ASA score refers to the classification of a patient’s severity andacuity of disease index. A total of 108 patients (4.53% of thepopulation) had at least one missing datum.‡Rockall TA, Logan RF, Devlin HB, et al.10

Table 1. Demographic and clinical features of the studypopulation*

FeaturesFrequency in the

population (N � 2380)

Male sex 65% (62.8-67.7)

Age (y), mean (� SD) 68 � 16

In-hospital bleeding 15% (12-16)

History of peptic ulcer 16.0% (14-17)

History of acute GI bleeding 28.5% (26.0-29.9)

Previous gastric surgery 5.9% (4.2-7.30)

No. of comorbidities (median, range) 1.0% (1.0, 0-2)

Comorbidities

Cardiovascular (CAD, CHF,hypertension, stroke)

50.4% (48.3-52.1)

Pulmonary 12.8% (11.3-13.0)

Diabetes 15.2% (14.2-16.0)

Chronic renal failure 12.0% (11.1-13.1)

Neoplasia 17.0% (15.9-18.9)

Cirrhosis 7.0% (6.0-8.1)

ASA score†

1-2 55.3% (52.9-58.5)

3 34.5% (32.4-36.9)

4 10.2% (8.7-11.6)

Symptoms on presentation

Melena 72% (70.2–76.5)

Hematemesis 25% (22.8-28.0)

Hemodynamic instability(orthostasis or shock)

19% (18.1-23.2)

Medications at presentation

NSAID 36% (32.1-37.9)

Low-dose aspirin (75-100 mg/day) 18.2% (15.6-20.1)

Anticoagulants 8.2% (6.7-10.0)

Laboratory test results (median,range)

Hemoglobin, g/dL 9.6 (9.0, 6.9-11.4)

Hematocrit, % 31 (30, 24-35)

International normalized ratio 1.3 (1.2, 1.0-1.4)

Timing of endoscopy (median,range), hours

8 (11, 1-18)

Source of bleeding at endoscopy

Duodenal ulcer 37.8% (34.9-40.8)

Gastric ulcer 27% (24.6-30.2)

ay have an impact on the level of care and inherent

www.giejournal.org

Rotondano et al Neural networks predict mortality in nonvariceal bleeding

TABLE 2. Input clinical variables used to build the ANN

Demographic Co-prescriptions

1 Male 36 Aspirin

2 Female 37 Ticlopidine

3 Age* 38 Clopidogrel

4 In-hospital bleeding 39 Rofecoxib

History 40 Celecoxib

5 History of acute GI bleeding 41 NSAID

6 History of peptic ulcer 42 Oral anticoagulants

7 Previous gastric surgery 43 Calcium antagonists*

Comorbidity 44 Nitrates

8 Hypertension 45 Heparin*

9 COPD* 46 Steroids

10 Chronic renal failure* 47 Antidepressants

11 Dialysis Features at presentation

12 Coronary artery disease 48 Syncope

13 Congestive heart failure* 49 Bright red blood emesis*

NYHA class 50 Melena

14 0 51 Coffee-ground emesis

15 I 52 Bright red blood per rectum

16 II 53 Black tarry stools

17 III Features at initial assessment

18 IV 54 Shock*

19 Diabetes* 55 Heart rate �100 beats per minute*

20 Cancer* 56 Systolic blood pressure �100 mm Hg*

21 Leukemia/lymphoma 57 Hemoglobin �7 g/dL*

Karnofsky performance scale 58 Hematocrit

22 0 59 Platelet count

23 10 60 Blood urea nitrogen

24 20 61 Prothrombin time (INR)

25 30 61 Red blood from the NGT

26 40 62 Black material from the NGT

27 50 ASA score*

28 60 63 I

29 70 64 II

30 80 65 III

31 90 66 IV

32 100 Other

33 Liver cirrhosis* 67 Time from symptoms to hospital admission �8 hours*

www.giejournal.org Volume 73, No. 2 : 2011 GASTROINTESTINAL ENDOSCOPY 223

8etphgRoorweld(

tAfaoptPu

Neural networks predict mortality in nonvariceal bleeding Rotondano et al

resource consumption but are less often related to theultimate outcome of patients with nonvariceal upper GIbleeding. The Hong Kong group22 recently reported a6.2% overall 30-day mortality rate, but only 20% of deathswere bleeding related. Of these, only about 30% were due

Figure 2. Receiver operating characteristic curves for the different pre-dictive instruments in predicting mortality. The area under the curve forthe artificial neural network–based model was significantly superior tothe complete Rockall score10 (0.95 [0.92-0.98] vs 0.67 [0.65-0.69]; P �.001).

TABLE 2. Continued

34 Cerebrovascular disease (stroke, transientischemic attack)*

35 Other disease(s)

ANN, Artificial neural network; COPD, chronic obstructive pulmonary disease;INR, international normalized ratio; NGT, nasogastric tube; ASA, American Soci*Variables retained in the final ANN model.

TABLE 3. Predictive performance of ANN and Rockall score (me

ANN

Accuracy (%) 96.8 (96.0-97.5)

Sensitivity (%) 83.8 (76.7-90-8)

Specificity (%) 97.5 (96.8-98.2)

PPV (%) 63.3 (55.3-71.3)

NPV (%) 99.1 (98.7-995)

LR � 33.8 (22.4-44.9)

LR � 0.17 (0.11-0.26)

ANN, Artificial neural network; PPV, positive predictive value; NPV, negative pr*Numbers in parentheses are 95% confidence intervals.

to uncontrolled or recurrent bleeding. p

224 GASTROINTESTINAL ENDOSCOPY Volume 73, No. 2 : 2011

The accuracy of the Rockall score10 was reported to be1% in one external validation study.23 Time-trend differ-nces in epidemiology of the disease, patient characteris-ics, and endoscopic and pharmacologic therapy may ex-lain the large difference, with the 67% accuracy recordederein. Although the predictive performance is down-raded from the original modeling, the advantage of theockall score is that it is simple to use by physicians duringr before endoscopy procedures and that the calculationf the risk of mortality can be done quickly. ANNs use aelatively small subset of readily available clinical data,ithout requiring the input of information obtained atndoscopy. In our experience, ANNs were easy to use:ess than 5 minutes to enter the clinical variables into theataset and less than 2 minutes for the calculation timedata not shown).

The absence of a satisfactory modeling explanation is the main shortcoming of ANNs in scientific research. ANNs can qualitatively demonstrate the existence of risk factors for mortality only, but, despite high sensitivity and specificity, they cannot demonstrate the effect size of each risk factor. Therefore, in exploring the potential predictive ability of the ANN, it was decided to apply the supervised ANN to the entire populations of the PNED studies, testing the same set of clinical variables used for logistic regression modeling.9,11 The relevant predictors of 30-day mortality were increased age, use of anticoagulants, patient general health status, the presence of comorbidity, hemodynamic instability, presentation with bright red hematemesis, low hemoglobin levels, and time elapsed from symptoms to hospital admission. Interestingly, 7 of 8 clinical variables of the recently published PNED score11 were picked up by the ANN system. This emphasizes the fact that some clinical parameters are likely to affect the risk of death with a heavier prognostic impact, independently of the scoring system adopted.

[Table fragment: list of input variables, including "Time from symptoms to endoscopy". NYHA, New York Heart Association; NSAID, nonsteroidal anti-inflammatory drugs; ASA, American Society of Anesthesiologists.]

[Table fragment: diagnostic yield of best predictive models*; only the Complete Rockall column is recoverable.]

Measure                          Complete Rockall     P value
Accuracy, %                      52.9 (50.8-55.0)     < .001
Sensitivity, %                   71.4 (62.8-80.0)     < .01
Specificity, %                   52.0 (49.8-54.2)     < .001
Positive predictive value, %     7.0 (5.5-8.6)        < .001
Negative predictive value, %     97.2 (96.3-98.2)
LR+                              1.49 (1.31-1.69)
LR-                              0.55 (0.40-0.74)

LR+, likelihood ratio positive; LR-, likelihood ratio negative.
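The diagnostic-yield measures quoted for the Rockall score are linked by simple arithmetic. The following minimal sketch, which is not part of the original analysis, derives accuracy, predictive values, and likelihood ratios from the sensitivity (71.4%), specificity (52.0%), and mortality (112 of 2380 patients) reported in this article. The function name `diagnostic_yield` and its return keys are illustrative assumptions.

```python
def diagnostic_yield(sensitivity, specificity, prevalence):
    """Derive accuracy, PPV, NPV, LR+ and LR- from three proportions in [0, 1]."""
    tp = sensitivity * prevalence                # true positives, as a fraction of the cohort
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return {
        "accuracy": tp + tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Rockall score figures taken from the article; prevalence is 30-day mortality.
rockall = diagnostic_yield(0.714, 0.520, 112 / 2380)
for name, value in rockall.items():
    print(f"{name}: {value:.3f}")
```

Under these inputs the sketch gives an accuracy of about 52.9%, LR+ of about 1.49, and LR- of about 0.55. The predictive values come out at roughly 6.8% and 97.4%, close to the reported 7.0% and 97.2%; the small gap presumably reflects rounding of the published proportions.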

It is important to emphasize that the ANN is not meant to replace or to substitute for endoscopic evaluation and/or for the clinical judgment of an experienced clinician. ANNs should be considered as research tools to aid in comprehending and validating yet unexplored complex interactions occurring in the individual patient, which at the moment cannot be mathematically formalized with frequency statistics. The potential target audience for ANNs may be the medical and paramedical personnel of emergency departments, which, with the help of the software (already commercially available), may predict patient outcomes better than with other scoring systems.

A number of limitations to the present study should be acknowledged. First, the ANN model was not compared with a totally nonendoscopic risk score such as the Blatchford score,24 but that instrument was developed to predict the need for treatment and not mortality. We decided to compare the ANN with models that addressed mortality as the primary outcome measure. Second, patients with variceal and obscure GI hemorrhage were excluded from the study. We focused our attention on patients with nonvariceal upper GI bleeding because these have been best studied in terms of risk stratification, and in these patients triage has been beneficial. Third, although supervised ANNs have been validated with a training and testing procedure on different patients belonging to the same sample population, our predictive instrument needs to be validated on a subsequent independent population. Last but not least, we are aware that the specific focus on 30-day mortality as the endpoint of this tool may limit the overall clinical utility of ANNs to a small subset of patients who have a high likelihood of dying within 30 days. Nonetheless, we believe that our effort can provide prognostic information for a “major warning” toward those patients at high risk of the most dreaded event, that is, death.

In conclusion, a nonendoscopic ANN-based model can perform better than the complete Rockall score10 in predicting death in patients with acute nonvariceal UGIH. ANNs should be considered as promising tools to improve prognostication in these patients in the emergency arena. Future studies will need to evaluate the true clinical relevance of ANNs, verify whether the new variables considered in the model are confirmed by other studies, and determine whether these instruments used by clinicians can predict the investigated outcome and modify clinical management of the patients.

ACKNOWLEDGMENT

The PNED initiative was a collaborative effort supported by the Italian Society for Digestive Endoscopy (SIED) and the Italian Association of Hospital Gastroenterologists (AIGO). We acknowledge the great deal of work performed by medical and nursing staff in each of the participating units.

REFERENCES

1. Vreeburg EM, Snel P, de Bruijne JW, et al. Acute upper gastrointestinal bleeding in the Amsterdam area: incidence, diagnosis, and clinical outcome. Am J Gastroenterol 1997;92:236-43.

2. van Leerdam ME, Vreeburg EM, Rauws EA, et al. Acute upper GI bleeding: did anything change? Time trend analysis of incidence and outcome of acute upper GI bleeding between 1993/1994 and 2000. Am J Gastroenterol 2003;98:1494-9.

3. Loperfido S, Baldo V, Piovesana E, et al. Changing trends in acute upper-GI bleeding: a population-based study. Gastrointest Endosc 2009;70:212-24.

4. Barkun AN, Bardou M, Kuipers EJ, et al; International Consensus Upper Gastrointestinal Bleeding Conference Group. Consensus recommendations for managing patients with nonvariceal upper gastrointestinal bleeding. Ann Intern Med 2010;152:101-3.

5. Gralnek IM, Barkun AN, Bardou M. Management of acute bleeding from a peptic ulcer. N Engl J Med 2008;359:928-37.

6. Cebollero-Santamaria F, Smith J, Gioe S, et al. Selective outpatient management of upper gastrointestinal bleeding in the elderly. Am J Gastroenterol 1999;94:1242-7.

7. Cipolletta L, Bianco MA, Rotondano G, et al. Outpatient management for low-risk non variceal upper gastrointestinal bleeding: a randomized controlled trial. Gastrointest Endosc 2002;55:1-5.

8. Cipolletta L, Rotondano G, Bianco MA. Gastrointestinal bleeding. Endoscopy 2007;39:7-10.

9. Marmo R, Koch M, Cipolletta L, et al. Predictive factors of mortality from non variceal upper gastrointestinal haemorrhage: a multicenter study. Am J Gastroenterol 2008;103:1639-47.

10. Rockall TA, Logan RF, Devlin HB, et al. Risk assessment after acute upper gastrointestinal haemorrhage. Gut 1996;38:316-21.

11. Marmo R, Koch M, Cipolletta L, et al. Predicting mortality in non variceal upper gastrointestinal bleeders: validation of the Italian PNED score and prospective comparison with the Rockall score. Am J Gastroenterol 2010;105:1284-91.

12. Chiu PW, Ng EK, Cheung FK, et al. Predicting mortality in patients with bleeding peptic ulcers after therapeutic endoscopy. Clin Gastroenterol Hepatol 2009;7:311-6.

13. Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. Lancet 1995;346:1075-9.

14. Baxt WG. Application of artificial neural networks to clinical medicine. Lancet 1995;346:1135-8.

15. Das A, Ben-Menachem T, Cooper GS, et al. Prediction of outcome in acute lower gastrointestinal haemorrhage based on an artificial neural network: internal and external validation of a predictive model. Lancet 2003;362:1261-6.

16. Das A, Ben-Menachem T, Farooq FT, et al. Artificial neural network as a predictive instrument in patients with acute nonvariceal upper gastrointestinal hemorrhage. Gastroenterology 2008;134:65-74.

17. Buscema M. Genetic Doping Algorithm (GenD): theory and applications. Expert Syst 2004;21:63-7.


18. Haykin S. Neural networks: a comprehensive foundation. New York: Macmillan Publishing; 1994.

19. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837-45.

20. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.

21. Chiu PW, Ng EK. Predicting poor outcome from acute upper gastrointestinal hemorrhage. Gastroenterol Clin North Am 2009;38:215-30.


22. Sung JJ, Tsoi KK, Ma TK, et al. Causes of mortality in patients with peptic ulcer bleeding: a prospective cohort study of 10,428 cases. Am J Gastroenterol 2010;105:84-9.

23. Vreeburg EM, Terwee CB, Snel P, et al. Validation of the Rockall risk scoring system in upper gastrointestinal bleeding. Gut 1999;44:331-5.

24. Blatchford O, Murray WR, Blatchford M. A risk score to predict need for treatment for upper-gastrointestinal hemorrhage. Lancet 2000;356:1318-21.


APPENDIX

The PNED investigators group includes Riccardo Marmo, Ospedale Curto, Polla; Livio Cipolletta, Gianluca Rotondano, and Maria A. Bianco, Ospedale Maresca, Torre del Greco; Lucio Capurso, Maurizio Koch, and Angelo Dezi, ACO San Filippo Neri, Rome; Angelo Pera and Rodolfo Rocca, Ospedale Mauriziano Umberto I, Torino; Fausto Barberani and Sandro Boschetto, Ospedale San Camillo De Lellis, Rieti; Alfredo Pastorelli and Elena Sanz Torre, Ospedale Bel Colle, Viterbo; Sergio Brunati and Renato Fasoli, Ospedale Cantù, Abbiategrasso; Ivano Lorenzini and Ugo Germani, AO Umberto I, Ancona; Giorgio Minoli and Giorgio Imperiali, Ospedale Valduce, Como; Giovanni Gatto and Mariano Amuso, AO Villa Sofia, Palermo; Massimo Proietti and Anna Tanzilli, Ospedale Del Prete, Pontecorvo; Walter Piubello, Maria Tebaldi, and Fabrizio Bonfante, Ospedale di Desenzano, Desenzano del Garda; Renzo Cestari and Domenico Della Casa, AO Ospedali Civili, University of Brescia, Brescia; Paolo Michetti and Paola Romagnoli, Ospedali Galliera, Genova; Omero Triossi and Andrea Buzzi, Ospedale Santa Maria delle Croci, Ravenna; Alessandro Casadei and Claudio Cortini, AO Morgagni, Forlì; Giorgio Chiozzini and Lisa Girardi, Ospedale Umberto I, Mestre; Luciano Allegretta and Salvatore Tronci, Ospedale Santa Caterina Novella, Galatina; Giovanni Aragona and Francesco Giangregorio, Ospedale Civile, Piacenza; Sergio Segato and Giuseppe Chianese, AO Ospedale Circolo e Fondazione Macchi, Varese; Andrea Nucci and Francesca Rogai, AO Ospedale Careggi, Firenze; Giampiero Bagnalasta and Claudio Leoci, Ospedale Civile, Manerbio; Giovanni Di Matteo and Paolo Giorgio, IRCCS Ospedale De Bellis, Castellana Grotte; and Mario Salvagnini, Ospedale San Bortolo, Vicenza.

The TWIST system (Semeion, Rome, Italy) is an ensemble of 2 algorithms: training and testing (T&T) and input selection (IS). The T&T algorithm is based on a population of n ANNs managed by an evolutionary system. In its simplest form, this algorithm reproduces several distribution models of the complete dataset DT (one for every ANN of the population) in 2 subsets (d[tr], the training set, and d[ts], the testing set). During the learning process, each ANN, according to its own data distribution model, is trained on the subsample d[tr], and blind-validated on the subsample d[ts]. The performance score reached by each ANN in the testing phase represents its “fitness” value (ie, the individual probability of evolution). The genome of each “network-individual” thus codifies a data distribution model with an associated validation strategy. The n data distribution models are combined according to their fitness criteria by using an evolutionary algorithm. The selection of network-individuals based on fitness determines the evolution of the population, that is, the progressive improvement of performance of each network until the optimal performance is reached, which is equivalent to the better division of the global dataset into subsets. The evolutionary algorithm mastering this process, named Genetic Doping Algorithm (GenD), which was created at Semeion Research Centre, has characteristics similar to those of a genetic algorithm but it is able to maintain an inner instability during the evolution, carrying out a natural increase of biodiversity and a continuous “evolution of the evolution” in the population. The elaboration of T&T is articulated in 2 phases:

1) Preliminary phase: In this phase, an evaluation of the parameters of the fitness function that will be used on the global dataset is performed. During this phase, an inductor DT[tr],A,F,Z(.) is configured, which consists of an ANN with a standard back-propagation algorithm (A). For this inductor, the optimal configuration to reach the convergence is stabilized at the end of different training trials on the global dataset DT; in this way, the configuration that most “suits” the available dataset is determined: the number of layers and hidden units and some possible generalizations of the standard learning law. The parameters thus determined define the configuration and the initialization of all the individual-networks of the population and will then stay fixed in the following computational phase. Basically, during this preliminary phase there is a fine-tuning of the inductor that defines the fitness values of the population’s individuals during evolution. The accuracy of the ANN performance with the testing set will be the fitness of that individual (that is, of that hypothesis of distribution into 2 halves of the whole dataset).

2) Computational phase: The system extracts from the global dataset the best training and testing sets. During this phase, the individual-network of the population is running, according to the established configuration and the initialization parameters. From the evolution of the population, managed by the GenD algorithm, the best distribution of the global dataset DT into 2 subsets is generated, starting from the initial population of possible solutions (DT[tr], DT[ts]). Preliminary experimental sessions are performed by using several different initializations and configurations of the network in order to achieve the best partition of the global dataset.
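The idea behind T&T, a population of candidate train/test partitions evolved by fitness, can be illustrated with a minimal sketch. This is not the GenD algorithm or the TWIST software: it assumes a plain genetic algorithm (truncation selection, one-point crossover, fixed mutation rate), a nearest-centroid classifier standing in for the back-propagation inductor, and synthetic one-dimensional data. All names (`evolve_split`, `fitness`, `min_frac`) are hypothetical.

```python
import random


def centroid_accuracy(train, test):
    """Stand-in inductor: nearest-centroid classifier on 1-D data; returns test accuracy."""
    sums = {0: [0.0, 0], 1: [0.0, 0]}
    for x, y in train:
        sums[y][0] += x
        sums[y][1] += 1
    if sums[0][1] == 0 or sums[1][1] == 0 or not test:
        return 0.0                                   # degenerate split: no usable fitness
    c0, c1 = sums[0][0] / sums[0][1], sums[1][0] / sums[1][1]
    hits = sum((abs(x - c1) < abs(x - c0)) == bool(y) for x, y in test)
    return hits / len(test)


def fitness(genome, data, min_frac=0.4):
    """Gene i = 1 sends record i to the training set, 0 to the testing set."""
    train = [r for g, r in zip(genome, data) if g]
    test = [r for g, r in zip(genome, data) if not g]
    if min(len(train), len(test)) < min_frac * len(data):
        return 0.0                                   # assumption: keep the two subsets near halves
    return centroid_accuracy(train, test)


def evolve_split(data, pop_size=20, generations=30, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    n = len(data)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: fitness(g, data), reverse=True)
        parents = ranked[: pop_size // 2]            # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda g: fitness(g, data))
    return best, fitness(best, data)


# Synthetic data: 30 records per class, class means 0 and 2.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), 0) for _ in range(30)]
data += [(data_rng.gauss(2, 1), 1) for _ in range(30)]
best, score = evolve_split(data)
print(f"training records: {sum(best)}, test accuracy of best split: {score:.2f}")
```

Because the partition itself is optimized for test performance, the resulting score is not an independent estimate, which is consistent with the need for validation on a subsequent independent population noted among the study limitations.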

IS runs at the same time. It is an adaptive system that is also based on the evolutionary algorithm GenD and is able to evaluate the relevance of the different variables of the dataset in an intelligent way. Therefore, it can be considered on the same level as a feature selection technique. From a formal point of view, IS is an artificial organism based on the GenD algorithm and consists of a population of ANNs, in which each one carries out a selection of the independent variables on the available database.


The elaboration of IS, as for T&T, is developed in 2 phases:

(1) Preliminary phase: During this phase, an inductor DT[tr],A,F,Z(.) is configured to evaluate the parameters of the fitness function. This inductor is a standard back-propagation ANN. The parameter configuration and the initialization of the ANNs are carried out with particular care to avoid possible overfitting problems that can be present when the database is characterized by a high number of variables that describe a low quantity of data. The number of epochs E0 necessary to train the inductor is determined through preliminary experimental tests.

(2) Computational phase: The inductor is active, according to the stabilized configuration and the fixed initialization parameters, to extract the most relevant variables of the training and testing subsets. Each individual-network of the population is trained on the training set D’[tr] and tested on the testing set D’[ts]. The evolution of the individual-network of the population is based on the algorithm GenD. In the IS approach, the GenD genome is built by n binary values, where n is the cardinality of the original input space. Every gene indicates whether an input variable is to be used or not during the evaluation of the population fitness. Through the evolutionary algorithm, the different “hypotheses” of variable selection, generated by each ANN of the population, change over time, at each generation. This leads to the selection of the best combination of input variables. As in the T&T systems, genetic operators crossover and mutation are applied on the ANN population; the rates of occurrence for both operators are self-determined by the system in an adaptive way at each generation. When the evolutionary algorithm no longer improves its performance, the process stops, and the best selection of the input variables is used on the testing subset.
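The binary-genome encoding described for IS can be sketched as follows. This is an illustration under stated assumptions, not the GenD implementation: it uses a plain genetic algorithm with fixed crossover and mutation settings (the text states that GenD adapts these rates at each generation), a nearest-centroid classifier in place of the back-propagation ANN, and synthetic data in which only the first 2 of 6 inputs carry signal. The names `select_inputs`, `max_stall`, and `record` are hypothetical.

```python
import random


def nearest_centroid_accuracy(train, test, mask):
    """Stand-in inductor: nearest-centroid classifier restricted to the inputs with gene = 1."""
    cols = [i for i, g in enumerate(mask) if g]
    if not cols:
        return 0.0                                   # no input variable selected
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]

    def dist(x, label):
        return sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[label]))

    hits = sum((dist(x, 1) < dist(x, 0)) == bool(y) for x, y in test)
    return hits / len(test)


def select_inputs(train, test, n_inputs, pop_size=20, max_stall=10,
                  mutation_rate=0.05, seed=0):
    rng = random.Random(seed)

    def score(genome):
        return nearest_centroid_accuracy(train, test, genome)

    # Genome: n binary values, one gene per input variable.
    pop = [[rng.randint(0, 1) for _ in range(n_inputs)] for _ in range(pop_size)]
    best, stall = max(pop, key=score), 0
    while stall < max_stall:                         # stop when performance no longer improves
        parents = sorted(pop, key=score, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_inputs)         # one-point crossover
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in a[:cut] + b[cut:]]     # bit-flip mutation
            children.append(child)
        pop = parents + children
        champion = max(pop, key=score)
        if score(champion) > score(best):
            best, stall = champion, 0
        else:
            stall += 1
    return best, score(best)


# Synthetic records: inputs 0 and 1 are informative, inputs 2 to 5 are noise.
data_rng = random.Random(1)


def record(label):
    informative = [data_rng.gauss(2.0 * label, 1.0) for _ in range(2)]
    noise = [data_rng.gauss(0.0, 1.0) for _ in range(4)]
    return informative + noise, label


train = [record(i % 2) for i in range(80)]
test = [record(i % 2) for i in range(40)]
mask, acc = select_inputs(train, test, n_inputs=6)
print("selected inputs:", [i for i, g in enumerate(mask) if g], "accuracy:", round(acc, 2))
```

The stopping rule mirrors the description above: the search ends once a fixed number of generations (`max_stall`, an assumed parameter) passes without improvement, and the best mask found is returned.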

In order to improve the speed and the quality of the solutions that have to be optimized, the GenD algorithm makes the evolutionary process of the artificial populations more natural and less centered on the individual liberalism culture.
