

  • SEVERITY OF ILLNESS SCORING IN THE INTENSIVE CARE UNIT:

    A COMPARISON OF LOGISTIC REGRESSION AND

    ARTIFICIAL NEURAL NETWORKS

    Gordon S. Doig

    Graduate Program in Epidemiology

    and Biostatistics

    Submitted in partial fulfillment

    of the requirements for the degree of

    Doctor of Philosophy

    Faculty of Graduate Studies

    The University of Western Ontario

    London, Ontario

    April, 1999

    © Gordon S. Doig 1999

  • National Library of Canada (Bibliothèque nationale du Canada), Acquisitions and Bibliographic Services

    395 Wellington Street, Ottawa ON K1A 0N4, Canada

    The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

    The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

  • Abstract

    Purpose: To compare the predictive performance of a series of logistic regression models (LMs) to a corresponding series of back-propagation artificial neural networks (ANNs).

    Location: A 30-bed adult general intensive care unit (ICU) that serves a 600-bed tertiary care teaching hospital.

    Patients: Consecutive patients with a duration of ICU stay greater than 72 hours.

    Outcome: ICU-based mortality.

    Methods: Data were collected on day one and day three of stay using a modified APACHE II methodology. A randomly generated 811-patient developmental database was used to build models using day one data (LM1 and ANN1), day three data (LM2 and ANN2) and a combination of day one and day three data (LMoT and ANNoT). Primary comparisons were based on the area under the receiver operating characteristic curve (aROC), as measured on a 338-patient validation database. Outcome predictions were also obtained from experienced ICU clinicians on a subset of patients.
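    The aROC used for the primary comparisons can be read as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that pairwise definition. It is an illustration only: the abstract does not say how the aROC was calculated in this study, and the function name `auc_rank`, the 0/1 outcome coding, and the toy data are all assumptions made for the example.

```python
def auc_rank(labels, scores):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    labels: sequence of 0/1 outcomes (assumed coding: 1 = died, 0 = survived)
    scores: predicted probabilities of death, in the same order as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case with each outcome")

    # Count (non-survivor, survivor) pairs in which the non-survivor was
    # given the higher predicted risk; tied predictions count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the thesis.
    outcomes = [0, 0, 1, 1]
    predicted = [0.1, 0.4, 0.35, 0.8]
    print(auc_rank(outcomes, predicted))  # 3 of 4 pairs ordered correctly: 0.75
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of survivors from non-survivors. The double loop is quadratic in the number of patients, which is acceptable for a validation set of a few hundred cases.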

    Results: Of the 3,728 patients admitted to the ICU during the period from March 1, 1994 through February 28, 1996, 1,181 qualified for entry into the study. There was no significant difference between LM and ANN models developed using day one data. The ANN developed using day three data performed significantly better than the corresponding LM (aROC LM2 0.7158 vs. ANN2 0.7845, p=0.0355). The time-dependent ANN model also performed significantly better than the corresponding LM (aROC LMoT 0.7342 vs. ANNoT 0.8095, p=0.0140).

    The predictions obtained from ICU consultants (aROC 0.8210) discriminated significantly better than LMoT (aROC 0.6814, p=0.0015) but there was no difference between the consultants and ANNoT (aROC 0.8094, p=0.7684).

    Conclusion: Although the 1,181 patients who became eligible for entry into this study represented only 32 percent of all ICU admissions, they accounted for 80 percent of the resources (costs) expended. ANNs demonstrated significantly better predictive performance in this clinically important group of patients. Four potential reasons are discussed: 1) ANNs are insensitive to problems associated with multicollinearity; 2) ANNs place importance on novel predictors; 3) ANNs automatically model nonlinear relationships; and 4) ANNs implicitly detect all possible interaction terms.

    Keywords: intensive care, critical care, severity-of-illness, logistic regression, artificial neural networks, genetic algorithms, back-propagation, receiver operating characteristic, predictive model building

  • Table of Contents

    CERTIFICATE OF EXAMINATION
    ABSTRACT
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    LIST OF EQUATIONS
    LIST OF APPENDICES
    LIST OF ABBREVIATIONS

    CHAPTER 1. INTRODUCTION

    CHAPTER 2. LITERATURE REVIEW
    2.1 MORTALITY PREDICTION IN THE INTENSIVE CARE UNIT
    2.2 THE APACHE SCORING SYSTEM
        2.2.1 APACHE II
        2.2.2 APACHE III
    2.3 THE MORTALITY PROBABILITY MODEL (MPM) SCORING SYSTEM
        2.3.1 MPM II
    2.4 SIMPLIFIED ACUTE PHYSIOLOGY SCORE (SAPS)
        2.4.1 SAPS II
    2.5 DAY OF SEVERITY-OF-ILLNESS SCORING
        2.5.1 APACHE
        2.5.2 MPM
        2.5.3 SAPS
    2.6 NEURAL NETWORKS
    2.7 RESEARCH QUESTIONS

    CHAPTER 3. METHODS
    3.1 LOCATION
    3.2 ETHICS
    3.3 PATIENT SELECTION AND DATA ABSTRACTION
        3.3.1 Consultant's Predicted Outcome
    3.4 PRIMARY OUTCOME OF INTEREST
    3.5 DATABASE VALIDATION
    3.6 ANALYSIS
        3.6.1 Descriptive statistics
        3.6.2 Developmental and validation data sets
    3.7 LOGISTIC REGRESSION MODEL DEVELOPMENT
        3.7.1 Basic logistic regression modeling methodology
        3.7.2 Day 1 logistic regression model development
        3.7.3 Day 3 logistic regression model development