using severity measures to predict the likelihood of death for pneumonia inpatients



Using Severity Measures to Predict the Likelihood of Death for Pneumonia Inpatients

Lisa I. Iezzoni, MD, MSc, Michael Shwartz, PhD, Arlene S. Ash, PhD, Yevgenia D. Mackiernan, BA

OBJECTIVE: To see whether predictions of patients' likelihood of dying in-hospital differed among severity methods.

DESIGN: Retrospective cohort.

PATIENTS: 18,016 persons 18 years of age and older managed medically for pneumonia; 1,732 (9.6%) in-hospital deaths.

METHODS: Probability of death was calculated for each patient using logistic regression with age, age squared, sex, and each of five severity measures as the independent variables: 1) admission MedisGroups probability of death scores; 2) scores based on 17 admission physiologic variables; 3) Disease Staging's probability of mortality model; 4) the Severity Score of Patient Management Categories (PMCs); and 5) the All Patient Refined Diagnosis-Related Groups (APR-DRGs). Patients were ranked by calculated probability of death; rankings were compared across severity methods. Frequencies of 14 clinical findings considered poor prognostic indicators in pneumonia were examined for patients ranked differently by different methods.

RESULTS: MedisGroups and the physiology score predicted a similar likelihood of death for 89.2% of patients. In contrast, the three code-based severity methods rated over 25% of patients differently by predicted likelihood of death when compared with the rankings of the two clinical data-based methods (MedisGroups and the physiology score). MedisGroups and the physiology score demonstrated better clinical credibility than the three severity methods based on discharge abstract data.

CONCLUSIONS: Some pairs of severity measures ranked over 25% of patients very differently by predicted probability of death. Results of outcomes studies may vary depending on which severity method is used for risk adjustment.

KEY WORDS: severity of illness; death rates; pneumonia.

J GEN INTERN MED 1996;11:23-31.

Received from the Division of General Medicine and Primary Care, Department of Medicine, Harvard Medical School, Beth Israel Hospital, the Charles A. Dana Research Institute, and the Harvard-Thorndike Laboratory (LII, YDM); Health Care Management Program and Operations Management Department, School of Management, Boston University (MS); and Health Care Research Unit, Section of General Internal Medicine, Evans Memorial Department of Clinical Research and Medicine, Boston University Medical Center (ASA).

Supported by the Agency for Health Care Policy and Research, under grant R01 HS06742-03. The views expressed are solely those of the authors.

Address correspondence and reprint requests to Dr. Iezzoni: Division of General Medicine and Primary Care, Department of Medicine, Beth Israel Hospital, 330 Brookline Avenue, Boston, MA 02215.

In many health services delivery settings, comparing provider performance is a central strategy for controlling costs while maintaining quality. 1-5 Managed care companies and self-insured employers increasingly compare physician and hospital performance to identify "preferred providers." 3-5 Provider "report cards" are published in local newspapers and magazines. 3-6 Individual hospitals and medical practices also monitor their own practice patterns to target areas for improvement and savings. Largely because they are easy to measure, provider-specific patient mortality rates are a standard performance indicator for diseases when imminent death is common. Comparing patient death rates across providers, however, generally requires adjustment for patient risk. This adjustment recognizes that some patients are more likely to do poorly than others, 7-9 and that some providers have sicker caseloads.

A variety of severity measures have been created to meet this health policy-directed need. 1-5, 10-12 These severity methods are frequently proprietary, with their logic unavailable for outside examination. They are marketed to hospitals, health care information companies, government officials, members of state and federal legislatures, and even business leaders. Some states (e.g., Pennsylvania, Iowa, Colorado), cities (e.g., Cleveland, Orlando), and payers have required providers to report information using a particular severity method. 1-5

Although these severity methods could significantly affect providers and patients, relatively little independent information is available about them. 13 In particular, no published analyses compare their clinical credibility. Unlike other clinical severity measures, which often consider wide-ranging patient attributes (e.g., physiologic stability, extent and complexity of comorbid illness, functional status, psychosocial and socioeconomic factors, preferences for outcomes), 14 these severity tools generally focus on acute clinical findings and comorbid diagnoses. They aim to predict inpatient death or hospital resource consumption. These methods often rate patients using limited data, such as computerized hospital discharge abstracts, 15, 16 or information gathered from medical records through abstraction protocols independent of patients' diagnoses. 17-19

Physicians must begin assessing severity tools given their potential impact on clinical practice. We focus here on five common severity measures and ask two major questions: (1) Do different severity methods predict similar likelihoods of death for the same patients? and (2) If not, which severity methods are more clinically credible?

METHODS

Severity Methodologies

We considered five severity measures (Table 1): the empirically derived, disease-specific admission MedisGroups score; 19 a physiology score patterned after the acute physiology score (APS) component of the Acute Physiology and Chronic Health Evaluation, third version (APACHE III); 20, 21 Disease Staging's empirically derived scale predicting probability of in-hospital death; 22, 23 the Severity Score of Patient Management Categories (PMCs); 24 and the All Patient Refined Diagnosis-Related Groups (APR-DRGs). 25, 26 These systems are representative of approaches used in severity-adjusting outcomes data for state or regional comparisons across hospitals, 1-5 and for individual hospital activities.

Each measure's definition of severity reflects its original goal; each assigns either numerical scores or values on a continuous scale (Table 1). APR-DRGs were not initially developed for mortality prediction; nonetheless they are used for mortality analyses by individual hospitals and some hospital associations. Disease Staging, PMC Severity Score, and APR-DRGs use data elements from standard hospital discharge abstracts, 15, 16 such as age, sex, and diagnoses and procedures coded using the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). MedisGroups and the physiology score use clinical data abstracted from medical records.

Database

The database is described in detail elsewhere. 27 Briefly, we used the 1992 MedisGroups Comparative Database, which contains information gathered by hospitals using MedisGroups' data abstraction protocol and submitted to its vendor, MediQual Systems, Inc. The database includes all calendar year 1991 discharges from 108 acute care hospitals chosen by MediQual Systems for good data quality and range of characteristics. To ensure adequate sample sizes for hospital-level analyses in another study, 27 we eliminated three low-volume institutions (29 patients total). The American Hospital Association annual survey provided information on hospital characteristics. The 105 hospitals were generally larger and more involved in teaching than other general acute care institutions nationwide. Fifty-nine hospitals were in Pennsylvania, a state requiring all hospitals to report MedisGroups data.

Admission MedisGroups scores were computed by MediQual Systems by applying empirically derived weights to clinical data abstracted by hospitals. 19 These scores were provided in the MedisGroups database. Scores for the other methods had to be assigned. The database includes values of key clinical findings (KCFs) from the admission period (generally the first two hospital days) abstracted from medical records during MedisGroups reviews. 19 We created physiology scores patterned after the APACHE III acute physiology score using these KCFs, assigning weights specified by APACHE III (e.g., a pulse of 145 beats/minute generated 13 points), 20 and summing them to produce scores. We could not replicate exact APACHE III acute physiology scores because complete values for the required 17 variables were unavailable due to MedisGroups' instructions not to collect data in broadly defined normal ranges. 28 A similarly derived physiology score nonetheless performed well compared with exact APACHE II acute physiology scores. 28, 29

The MedisGroups database also contains standard discharge abstract information listed by hospitals, including ICD-9-CM codes for up to 20 diagnoses and 50 procedures. Vendors scored severity for the three discharge abstract-based methods by applying their severity scoring software to computer files containing necessary discharge abstract data elements extracted by us from the MedisGroups database.

Table 1. Description of Five Severity Methods

Severity Method | Source | Data Used; Definition of Severity (a) | Classification Approach and Derivation (b)
MedisGroups (Atlas MQ) 19 | MediQual Systems, Inc. | Clinical data; in-hospital death; score calculated within 64 disease groups | Probability ranging from 0 to 1; empirical modeling
Physiology score | Patterned after Acute Physiology Score, Acute Physiology and Chronic Health Evaluation III (APACHE III) 20, 21 | Clinical data; in-hospital mortality for patients in intensive care unit | Integer score starting with 0; APACHE III's Acute Physiology Score ranges from 0 to 252; empirical modeling with clinical guidance
Disease Staging 22, 23 | SysteMetrics/MEDSTAT Group | Discharge abstract; probability of in-hospital death | Probability ranging from 0 to 1; empirical modeling
Patient Management Categories (PMCs) Severity Score 24 | Pittsburgh Research Institute | Discharge abstract; in-hospital morbidity and mortality | Score of 1, 2, 3, 4, 5, 6, or 7; empirical modeling following clinical guidance
All Patient Refined Diagnosis-Related Groups (APR-DRGs) 25, 26 | 3M Health Information Systems | Discharge abstract; total hospital charges | Four severity classes (A, B, C, D) within adjacent DRGs (c); empirical modeling with clinical guidance

(a) "Discharge abstract" indicates standard hospital discharge data elements, such as basic demographics, diagnosis, and procedure codes. "Clinical data" indicates clinical information (e.g., vital signs, test results) abstracted from the medical record.
(b) Derivation indicates the principal method used to create the severity scoring method. "Clinical judgments" reflects primarily use of expert physician guidance. "Empirical modeling" indicates primarily use of statistical techniques.
(c) DRG = diagnosis-related group; "adjacent DRGs" are formed by grouping individual DRGs previously split by complications and comorbidities.

JGIM, Volume 11, January 1996

Outcome Measure and Study Sample

Our outcome measure was in-hospital death. Information on deaths following discharge was unavailable.

We studied adults hospitalized for medical treatment of pneumonia. We initially selected patients with a principal diagnosis code indicating viral, bacterial, or other nontubercular pneumonias or pneumonitis. (The list of ICD-9-CM codes used for this initial selection was obtained from and identical to that specified in 1991 by the pneumonia Patient Outcome Research Team at the University of Pittsburgh, funded by the Agency for Health Care Policy and Research. This list of ICD-9-CM codes is available from the authors.) This approach identified 20,041 patients, including some in surgical DRGs (e.g., DRG 75, major chest procedures). We retained only patients with medical treatment identified by DRGs: 79/80, respiratory infections and inflammations, which include more serious pneumonias (e.g., Staphylococcus aureus, Klebsiella, Pseudomonas); and 89/90, simple pneumonia and pleurisy (e.g., pneumococcal, streptococcal, Hemophilus influenzae pneumonias).

Analytic Methods

The first research question asked whether predictions of in-hospital mortality for the same patients were similar across the five severity methods.

Using each severity method, a predicted probability of death was calculated for each patient from a multivariable logistic regression model including the severity score, sex, age, and age-squared. Severity scores were entered as either continuous or categorical variables (Table 1). For Disease Staging and MedisGroups, we used the logit of the probability as the independent variable in the logistic regression.
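The logit transformation and the form of the multivariable model described above can be sketched in Python. This is an illustrative sketch only, not the authors' code: the function names and the coefficient dictionary are hypothetical, and the coefficient values are placeholders chosen so the arithmetic is easy to check, not estimates from this study.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def expit(z: float) -> float:
    """Inverse of logit: map a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_death_probability(severity_prob, age, female, coef):
    """Apply an already-fitted logistic model to one patient.

    severity_prob: vendor-supplied probability of death (for example a
        MedisGroups or Disease Staging score), entered as its logit.
    coef: hypothetical dict of fitted coefficients.
    """
    z = (
        coef["intercept"]
        + coef["severity_logit"] * logit(severity_prob)
        + coef["age"] * age
        + coef["age_squared"] * age ** 2
        + coef["female"] * (1 if female else 0)
    )
    return expit(z)


# Placeholder coefficients (assumption): with these values the model simply
# returns the vendor probability unchanged, which makes the result checkable.
example_coef = {
    "intercept": 0.0,
    "severity_logit": 1.0,
    "age": 0.0,
    "age_squared": 0.0,
    "female": 0.0,
}
p = predicted_death_probability(0.2, 70, True, example_coef)
```

Entering the logit rather than the raw probability keeps the severity score on the same log-odds scale as the model's linear predictor.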

Measures of Model Performance

We used the c statistic and R² as overall measures of the ability of the models to predict individual patient outcomes. The c statistic equals the area under a receiver operating characteristic (ROC) curve; 30 it measures how well models discriminate between patients who lived and those who died. A c value of 0.5 indicates no ability to discriminate, while a value of 1.0 indicates perfect discrimination. 31, 32
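One common way to compute the c statistic is by direct pairwise comparison, which is equivalent to the area under the ROC curve. The Python sketch below illustrates that definition; it is not the authors' implementation, and the function name is an assumption.

```python
def c_statistic(predicted, died):
    """Concordance (c) statistic by pairwise comparison.

    Over every pair of one patient who died and one who survived, count the
    fraction in which the patient who died had the higher predicted
    probability. Tied predictions count as half a concordant pair.
    """
    deaths = [p for p, d in zip(predicted, died) if d]
    survivors = [p for p, d in zip(predicted, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p_dead in deaths:
        for p_alive in survivors:
            if p_dead > p_alive:
                concordant += 1.0
            elif p_dead == p_alive:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))
```

With perfectly separated predictions the function returns 1.0, and with identical predictions for everyone it returns 0.5, matching the interpretation given above. The double loop is quadratic in the number of patients; a rank-based formula would be preferable for very large data sets.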

Assessments of statistical performance (c and R²) are sometimes overly optimistic when regression models are developed and evaluated using the same data. To guard against this, we also calculated cross-validated c and R² values as follows: (1) data were randomly split in half; (2) coefficients for each model were estimated on the first half of the data, and "validated" performance measures were calculated by applying these coefficients to the second half; and (3) we repeated this process, developing the model on the second half and validating it on the first half of data. 33 Cross-validated c and R² statistics were the average of the validated statistics calculated on each half of the data.
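The three-step split-half procedure can be sketched generically in Python. The `fit` and `score` callables are hypothetical placeholders for whatever model-fitting routine and performance measure (c or R²) are used; the seed and helper name are likewise assumptions, not details from the study.

```python
import random


def split_half_cross_validate(records, fit, score, seed=0):
    """Two-fold (split-half) cross-validation.

    fit(rows)          -> a fitted model (any object)
    score(model, rows) -> a performance statistic computed on rows
    Returns the average of the two validated statistics.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # step 1: random split in half
    mid = len(shuffled) // 2
    first, second = shuffled[:mid], shuffled[mid:]

    model_first = fit(first)                      # step 2: develop on first half
    validated_second = score(model_first, second)  # ... validate on second half

    model_second = fit(second)                    # step 3: swap the roles
    validated_first = score(model_second, first)

    return (validated_first + validated_second) / 2.0
```

Because each model is scored only on patients it never saw during fitting, a large gap between the cross-validated statistic and the full-sample statistic would signal overfitting.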

Ranking Patients by Predicted Probability of Death

For each severity method, we ranked patients by their predicted probability of death based on the multivariable model. We then divided patients into 10 equal groups (deciles from 1 to 10) based on increasing likelihood of dying. We report the actual death rates and the expected death rates (i.e., mean predicted probability of death × 100) for patients in each decile. These figures suggest how well models separated patients with very high and very low risks of death.
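As a rough illustration of the decile calculation described above, the following Python sketch ranks patients and tabulates observed and expected death rates. The function names are hypothetical, and the handling of tied probabilities (broken by input order) is an assumption; the paper does not say how ties were resolved.

```python
def assign_deciles(probabilities):
    """Return a decile (1 to 10) for each patient by increasing probability.

    Patients are ranked and cut into 10 groups of (nearly) equal size.
    Tied probabilities are ordered by their position in the input.
    """
    n = len(probabilities)
    order = sorted(range(n), key=lambda i: probabilities[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def death_rates_by_decile(probabilities, died):
    """List of (decile, observed death rate %, expected death rate %)."""
    deciles = assign_deciles(probabilities)
    rows = []
    for d in range(1, 11):
        members = [i for i, dec in enumerate(deciles) if dec == d]
        if not members:
            continue  # possible only when there are fewer than 10 patients
        observed = 100.0 * sum(died[i] for i in members) / len(members)
        expected = 100.0 * sum(probabilities[i] for i in members) / len(members)
        rows.append((d, observed, expected))
    return rows
```

Comparing the observed and expected columns decile by decile gives a simple view of calibration alongside the spread between the lowest and highest risk groups.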

We created 10-by-10 tables, arraying patients within probability of death deciles computed by one severity method against deciles computed by a second method. Given that there were five severity methods, we produced ten 10-by-10 tables, one for each of the ten pairwise comparisons. Average probabilities of death within each decile indicated that differences of three or more deciles represented an important difference in the predicted likelihood of dying between severity methods. For each 10-by-10 table, we counted the fractions of patients who had "similar" predicted probabilities of death (probabilities calculated by severity methods A and B were within two deciles of each other); and "different" predicted probabilities of death (probabilities calculated by severity methods A and B were three or more deciles apart). We separated patients with "different" predicted probabilities of death into two groups: those for whom the probabilities calculated by one severity method were much higher than probabilities calculated by the other severity method; and those for whom the probabilities were much lower.
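The pairwise comparison can be sketched in Python as follows. The function names and group labels are assumptions made for illustration; only the three-decile threshold comes from the text.

```python
from collections import Counter


def cross_tabulate(deciles_a, deciles_b):
    """10-by-10 table of patient counts: rows are method A deciles,
    columns are method B deciles (both numbered 1 to 10)."""
    table = [[0] * 10 for _ in range(10)]
    for a, b in zip(deciles_a, deciles_b):
        table[a - 1][b - 1] += 1
    return table


def compare_rankings(deciles_a, deciles_b, threshold=3):
    """Percentage of patients rated similarly or differently by two methods.

    "A = B": deciles within two of each other (similar).
    "A > B": method A's decile is at least `threshold` higher.
    "A < B": method A's decile is at least `threshold` lower.
    """
    counts = Counter()
    for a, b in zip(deciles_a, deciles_b):
        if a - b >= threshold:
            counts["A > B"] += 1
        elif b - a >= threshold:
            counts["A < B"] += 1
        else:
            counts["A = B"] += 1
    n = len(deciles_a)
    return {key: 100.0 * counts[key] / n for key in ("A = B", "A > B", "A < B")}
```

For example, a patient placed in decile 5 by method A and decile 2 by method B falls in the "A > B" group, while deciles 4 and 6 count as similar.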

Analysis Involving Key Clinical Findings

After finding that different severity methods predicted very different probabilities of death for the same patients, our next question was which severity measure is more clinically credible?

Given that we could not go back to medical records to review these cases, we investigated clinical credibility using 14 admission MedisGroups KCFs (findings present in the first two hospital days) described by the clinical literature as important prognostic indicators in pneumonia:

1. High respiratory rate (>40 breaths/minute)
2. High rectal temperature (≥40°C)
3. Coma
4. Lethargy
5. Disorientation
6. Low serum sodium (Na < 130 mEq/liter)
7. High blood urea nitrogen (BUN ≥ 31 mg/100 ml)
8. Low serum albumin (ALB ≤ 2.4 gm/100 ml)
9. High white blood cell count (≥ 15,000)
10. Low arterial oxygenation (pO2 ≤ 60 mm Hg)
11. Low arterial pH (≤7.34)
12. Positive blood culture
13. Infiltrate on chest radiograph
14. History of cancer

We examined each KCF individually for its relation to in-hospital death using two-by-two tables and χ² statistics. We also computed a logistic regression with the clinical findings as follows: Death = Dummy variables for each KCF + Age + Age-squared + Sex. We report odds ratios (with 95% confidence intervals) for each KCF from this model. This model was computed only to assist in evaluating the clinical credibility of the five severity methods. We did not aim to create a severity prediction model for pneumonia, as has been recently reported. 34
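The unadjusted two-by-two analysis can be illustrated with a short Python sketch. The function names are hypothetical, the example counts in the usage note are invented for arithmetic convenience, and the Pearson χ² without continuity correction is an assumption, since the paper does not state which variant was used. The adjusted odds ratios reported in the paper come from the logistic regression, not from this calculation.

```python
def two_by_two(finding, died):
    """Counts (a, b, c, d) for a finding-by-death two-by-two table.

    a: finding present, died      b: finding present, survived
    c: finding absent, died       d: finding absent, survived
    """
    a = b = c = d = 0
    for has_finding, dead in zip(finding, died):
        if has_finding and dead:
            a += 1
        elif has_finding:
            b += 1
        elif dead:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio of death for patients with the finding."""
    return (a * d) / (b * c)


def chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
```

With invented counts of 20 deaths among 100 patients with a finding and 10 deaths among 100 without, `odds_ratio(20, 80, 10, 90)` gives 2.25.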

We used these KCFs to assess the clinical credibility of different severity methods. For each pair of severity methods, we identified patients with very different likelihoods of death predicted by the two methods (as described above) and looked at the percentage of patients within each group who had clinical findings associated with an adjusted odds ratio of death significantly greater than one. When one group had more clinical findings than the other, that suggested better clinical credibility for the severity method that viewed those patients as sicker (more likely to die).
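A minimal Python sketch of this credibility comparison is given below. The group labels and function name are assumptions for illustration; the logic simply computes, for each group of patients, the percentage with a given clinical finding.

```python
def finding_prevalence(groups, has_finding):
    """Percentage of patients with a clinical finding within each group.

    groups: one label per patient, for example "A > B" for patients whom
        method A rated much sicker than method B, and "A < B" for the reverse.
    has_finding: one truthy or falsy value per patient.
    """
    totals = {}
    positives = {}
    for group, found in zip(groups, has_finding):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if found else 0)
    return {g: 100.0 * positives[g] / totals[g] for g in totals}
```

If the finding is clearly more prevalent in the "A > B" group than in the "A < B" group, that pattern favors method A for that finding.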

ICD-9-CM Code-Derived Clinical Findings

As in standard discharge abstracts, the list of ICD-9-CM diagnoses contains conditions treated throughout the hospitalization, regardless of when they arose. One hypothesis about the ability of discharge abstract-based severity methods to predict death is their reliance on ICD-9-CM codes for conditions, such as cardiac arrest, arising late in the hospital stay. To examine this possibility, we used ICD-9-CM secondary diagnosis codes to define clinical findings similar to those represented by the KCFs. Given the limitations of ICD-9-CM, this was possible for 7 of the 14 clinical findings. ICD-9-CM-based variables were also created for cardiac arrest, respiratory arrest, respiratory failure, and shock. Analyses identical to those described above for the KCF-based clinical findings were performed using these 11 ICD-9-CM-based clinical findings (a list of the ICD-9-CM diagnosis codes is available from the authors upon request).

RESULTS

The final data set contained 18,016 patients from 105 hospitals, with 1,732 (9.6%) in-hospital deaths. Patients ranged from 18 to 105 years of age, with a mean (SD) age of 69.4 (18.1) years; 51.6% of patients were female. Mean (SD) length of stay was 9.1 (7.1) days. For the discharge abstract-based methods, ample numbers of diagnosis codes were generally present for rating severity. The average case had 5.6 (SD, 2.9) diagnosis codes. Only 3.7% of patients had one discharge diagnosis; 45.7% had more than five diagnoses listed, and 9.3% had 10 or more diagnoses. The most common principal diagnosis was pneumonia, organism unspecified (ICD-9-CM code 486, 59.3% of cases), followed by aspiration pneumonia (code 507, 8.0% of cases) and pneumococcal pneumonia (code 481, 5.7% of cases).

Statistical Performance Measures

The five severity methods varied in their statistical performance (Table 2). MedisGroups had the highest c and R² values; c and R² were fairly similar for the physiology score, Disease Staging, and PMC Severity Score. Cross-validated statistics were generally identical to values from the entire data set, indicating that the models were not overfit.

Predicted Probabilities of Death

Table 2 also shows the observed and predicted death rates for each decile defined by increasing predicted probability of death. All severity methods arrayed patients along wide ranges of predicted probabilities. MedisGroups had the broadest range, with a predicted death rate of 0.7% for the lowest decile and 43.6% for the highest.

Table 2. Measures of Model Performance for Predicting Death: Statistics Fit on All Cases and Cross-Validated Statistics

Statistical Performance Measure | MedisGroups | Physiology Score | Disease Staging | PMC Severity Score | APR-DRGs
c | 0.85 | 0.81 | 0.80 | 0.79 | 0.78
R² | 0.19 | 0.15 | 0.13 | 0.12 | 0.10
Cross-validated c | 0.85 | 0.81 | 0.80 | 0.80 | 0.78
Cross-validated R² | 0.19 | 0.15 | 0.13 | 0.13 | 0.10

Deciles by predicted probability of death: Observed Death Rate (Predicted Death Rate)

Decile | MedisGroups | Physiology Score | Disease Staging | PMC Severity Score | APR-DRGs
1 | 0.2 (0.7) | 0.3 (1.0) | 0.6 (0.8) | 0.3 (1.0) | 0.2 (0.9)
2 | 0.7 (1.4) | 1.7 (2.2) | 0.7 (1.8) | 1.7 (2.1) | 1.2 (2.0)
3 | 1.3 (2.2) | 2.8 (3.4) | 1.5 (2.7) | 2.5 (3.1) | 2.9 (3.1)
4 | 2.1 (3.0) | 3.3 (4.4) | 3.5 (3.7) | 4.2 (4.1) | 3.9 (4.1)
5 | 3.4 (4.0) | 3.8 (5.3) | 4.4 (4.9) | 6.6 (5.3) | 5.8 (5.2)
6 | 5.0 (5.4) | 6.1 (6.5) | 7.3 (6.5) | 7.1 (6.8) | 8.0 (6.7)
7 | 8.4 (7.4) | 8.4 (8.0) | 11.0 (8.9) | 8.8 (8.9) | 8.7 (8.9)
8 | 13.2 (10.6) | 12.0 (10.4) | 12.8 (12.4) | 11.7 (11.4) | 13.1 (12.0)
9 | 19.6 (17.9) | 19.3 (15.2) | 19.9 (18.1) | 19.0 (18.0) | 21.3 (20.0)
10 | 42.4 (43.6) | 38.4 (39.7) | 34.4 (36.4) | 34.2 (35.4) | 31.0 (33.2)

Table 3 indicates the fraction of patients with similar and different predicted probabilities of death for the 10 pairs of severity measures and the percentage who died in each group. MedisGroups and the physiology score generally ranked patients similarly, with 89.2% of patients having probabilities that were similar (i.e., within two deciles of each other). In contrast, the code-based methods (Disease Staging, PMC, and APR-DRGs) ranked more than 25% of patients differently by probability of death compared with rankings of the two clinical data-based methods (MedisGroups and the physiology score). Among the code-based methods, PMC Severity Score and APR-DRGs agreed the best, ranking 85.6% of patients similarly by predicted likelihood of dying.

In all six comparisons of clinical data-based methods with code-based methods, patients seen as having higher likelihoods of death by the clinical methods were more likely to die than patients viewed as sicker by the code methods.

Clinical Findings Analysis

E a c h of t he 14 clinical f indings had a s t rong individ-

ua l re la t ion wi th in -hosp i t a l d e a t h (Table 4). The logistic

r eg res s ion inc lud ing age, age - squa red , a n d sex p lus the

14 KCFs y ie lded a c s ta t i s t ic of 0 .83 a n d R 2 of 0.16. W h e n

cont ro l l ing for o the r cl inical f indings, age, age - squa red ,

and sex, the odds ra t ios were s igni f icant for all cl inical

f indings excep t e leva ted whi te b lood cell coun t , posi t ive

b lood cu l ture , and inf i l t ra te on c h e s t r ad iog raph (Table 4).

The odds rat io for e levated t e m p e r a t u r e was 0.75, sug-

gest ing, a s in a n o t h e r s tudy, as t ha t pa t i en t s capab le of

m o u n t i n g febrile r e s p o n s e s a re less likely to die t h a n

t hose w h o canno t . Table 5 looks a t pa t i en t s wi th very dif ferent probabi l i -

t ies of d e a t h p red ic ted by pa i r s of sever i ty me thods . To

sugges t wh ich of each pair of sever i ty m e a s u r e s is more

cl inical ly credible, Table 5 shows the pe r cen t age of t he se

pa t i en t s who h a d e a c h of the 11 cl inical f indings t h a t sig-

n i f icant ly i nc rea sed the a d j u s t e d odds of dying. For exam-

ple, a m o n g pa t i en t s viewed as s icker by Med i sGroups

t h a n by Disease Staging, 14.5% h a d a h igh resp i ra to ry

ra te on admiss ion ; in cont ras t , for pa t i en t s viewed as

sicker by Disease Staging than by MedisGroups, only 1.4% had high respiratory rates. For all clinical variables except history of cancer, patients seen as more likely to die by MedisGroups were more likely to have the finding than patients viewed as relatively more severe by Disease Staging. This suggested that MedisGroups had better clinical credibility than Disease Staging as a measure of admission severity.

Table 3. Percentage of Patients with Very Different Predicted Probabilities of Death Calculated by Pairs of Severity Measures (and Percentage of These Patients Who Died)

                                          Relative Predicted Probability of Death by Severity Method
Severity Method A   Severity Method B     A = B          A > B          A < B
MedisGroups         Physiology score      89.2 (9.7)     5.9 (13.2)     4.9 (4.5)
MedisGroups         Disease staging       73.4 (9.6)     13.4 (11.8)    13.2 (7.2)
MedisGroups         PMC severity score    72.0 (9.1)     14.0 (14.5)    14.0 (7.1)
MedisGroups         APR-DRG               73.2 (9.1)     13.5 (14.4)    13.3 (7.7)
Physiology score    Disease staging       73.5 (9.1)     13.1 (11.1)    13.4 (10.9)
Physiology score    PMC severity score    74.8 (8.9)     12.4 (13.9)    12.8 (9.8)
Physiology score    APR-DRG               75.1 (9.0)     12.3 (13.7)    12.6 (9.3)
Disease staging     PMC severity score    78.7 (9.7)     10.6 (11.1)    10.7 (7.2)
Disease staging     APR-DRG               79.3 (9.8)     10.2 (10.7)    10.5 (7.2)
PMC severity score  APR-DRG               85.6 (9.6)     6.3 (9.2)      8.1 (10.0)

Iezzoni et al., Predicting Death for Pneumonia Inpatients, JGIM

Table 4. Percentage of Patients with Clinical Finding, Percentage Who Died by Presence of Clinical Finding, and Adjusted Odds Ratio for Dying by Clinical Finding

                                  % Patients     % Who Died     % Who Died        Adjusted
Clinical Finding                  with Finding   with Finding   without Finding   OR (a)     95% CI         p Value
Respiratory rate > 40             6.9            28.9           8.2*              2.61       (2.23, 3.07)   .0001
Rectal temperature ≥ 40°C         8.7            7.2            9.8**             0.75       (0.60, 0.95)   .01
Coma                              4.8            39.7           8.1*              3.39       (2.86, 4.02)   .0001
Lethargy                          8.7            24.2           8.2*              1.86       (1.60, 2.16)   .0001
Disorientation                    5.0            18.4           9.2*              1.49       (1.23, 1.80)   .0001
Serum sodium < 130                6.6            14.6           9.3*              1.65       (1.37, 2.00)   .0001
Serum BUN ≥ 31                    21.2           24.2           5.7*              2.80       (2.48, 3.15)   .0001
Serum albumin ≤ 2.4               6.1            22.3           8.8*              2.01       (1.70, 2.38)   .0001
White blood cell count ≥ 15,000   23.7           12.8           8.6*              1.08       (0.96, 1.21)   .24
Arterial pO2 ≤ 60                 26.4           14.1           8.0*              1.36       (1.20, 1.53)   .0001
Arterial pH ≤ 7.34                5.4            32.3           8.3*              2.87       (2.42, 3.40)   .0001
Positive blood culture            6.4            14.4           9.3*              1.08       (0.88, 1.32)   .44
Infiltrate on chest radiograph    55.4           10.3           8.7*              1.05       (0.94, 1.18)   .37
History of cancer                 14.3           16.7           8.4*              2.32       (2.04, 2.65)   .0001

(a) Adjusted for age, age-squared, sex, and other clinical findings.
*p = .000; **p ~ .001 for comparison of the percentage who died by presence or absence of the clinical finding.
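Table 4 reports odds ratios adjusted for age, age squared, sex, and the other clinical findings. As a rough illustration only, the sketch below computes crude (unadjusted) odds ratios from the two death percentages in Table 4 so that they can be set beside the adjusted values. The function name `crude_odds_ratio` and the choice of four rows are ours and are not part of the original analysis; values computed from rounded percentages are approximate.

```python
def crude_odds_ratio(pct_died_with: float, pct_died_without: float) -> float:
    """Unadjusted odds ratio of death, given two percentages on a 0-100 scale."""
    p_with = pct_died_with / 100.0
    p_without = pct_died_without / 100.0
    odds_with = p_with / (1.0 - p_with)
    odds_without = p_without / (1.0 - p_without)
    return odds_with / odds_without


# (finding, % died with finding, % died without finding, adjusted OR), from Table 4
TABLE4_ROWS = [
    ("Respiratory rate > 40", 28.9, 8.2, 2.61),
    ("Rectal temperature >= 40 C", 7.2, 9.8, 0.75),
    ("Coma", 39.7, 8.1, 3.39),
    ("History of cancer", 16.7, 8.4, 2.32),
]

# Crude odds ratio for each selected finding
crude = {name: crude_odds_ratio(w, wo) for name, w, wo, _ in TABLE4_ROWS}

for name, _, _, adjusted in TABLE4_ROWS:
    print(f"{name}: crude OR {crude[name]:.2f}, adjusted OR {adjusted:.2f}")
```

Where the crude and adjusted values differ noticeably (respiratory rate > 40 and coma, for example), part of the raw association is presumably shared with age, sex, or the other findings in the model.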

Table 5 suggests that the clinical data-based methods were more clinically credible than the code-based methods. No code-based method clearly stood out as the most clinically credible. For example, in comparing Disease Staging with APR-DRGs, six clinical findings favored Disease Staging while five favored APR-DRGs.
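The tally for Disease Staging versus APR-DRGs can be reproduced from the two corresponding rows of Table 5. The sketch below assumes, following the Table 5 footnote, that a finding "favors" whichever method's sicker-ranked patients show the higher prevalence. The helper `count_favoring` is a hypothetical name introduced for illustration.

```python
FINDINGS = [
    "High RR", "No high T", "Coma", "Lethargy", "Disoriented",
    "Low Na", "High BUN", "Low ALB", "Low pO2", "Low pH", "Cancer",
]

# Prevalence (%) of each finding, copied from Table 5.
# Patients ranked sicker by Disease Staging than by APR-DRGs:
sicker_by_disease_staging = [7.9, 92.9, 5.9, 9.4, 5.7, 5.9, 20.7, 6.0, 23.2, 5.3, 42.2]
# Patients ranked sicker by APR-DRGs than by Disease Staging:
sicker_by_apr_drgs = [6.8, 92.3, 4.1, 8.3, 4.6, 8.9, 23.7, 8.7, 29.8, 5.4, 7.0]


def count_favoring(group_a, group_b):
    """Return (findings more prevalent in group_a, findings more prevalent in group_b)."""
    a_higher = sum(a > b for a, b in zip(group_a, group_b))
    b_higher = sum(b > a for a, b in zip(group_a, group_b))
    return a_higher, b_higher


favoring = count_favoring(sicker_by_disease_staging, sicker_by_apr_drgs)
print(f"Favoring Disease Staging: {favoring[0]}, favoring APR-DRGs: {favoring[1]}")
```

A near-even split such as this one is what the text means by no code-based method clearly standing out.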

Clinical Findings Based on ICD-9-CM Codes

Only six of the ICD-9-CM-derived clinical findings occurred in at least 1% of patients: respiratory failure, 3.6%; low sodium, 6.4%; acute and chronic renal failure, 1.5% and 2.1% respectively; septicemia or bacteremia, 3.3%; and cancer, 14.8%. All were associated with a much increased risk of in-hospital death. Viewed along with admission KCFs, the data suggested that these ICD-9-CM codes frequently reflected conditions arising after the first two hospital days. For example, of the 1,145 patients with an ICD-9-CM code for hyponatremia, only 42.9% had a KCF indicating low sodium on admission; of the 594 patients with an ICD-9-CM code for septicemia or bacteremia, only 52.2% had a KCF indicating a positive blood culture on admission. However, the findings relating to cancer, a condition certainly predating hospitalization, raised concern about how both KCFs and ICD-9-CM codes are determined. Of the 2,666 patients with a cancer code, only 79.7% also had a KCF indicating cancer.

Further analysis generally supported the hypothesis that code-based severity measures rely heavily on ICD-9-CM codes representing grave conditions arising later in the hospital stay. For example, of patients viewed as sicker by Disease Staging than by the physiology score, 9.5% had an ICD-9-CM code representing respiratory failure; in contrast, of patients viewed as sicker by the physiology score than by Disease Staging, only 0.1% had a respiratory failure code. Code-based methods also appeared to assign relatively higher scores to patients with a cancer ICD-9-CM code. For example, of patients seen as sicker by Disease Staging than by the physiology score, 44.0% had a cancer code; of patients seen as sicker by the physiology score than by Disease Staging, only 4.8% had a cancer code.

DISCUSSION

Some pairs of severity measures ranked large fractions of patients (over 25%) very differently by their predicted probability of death. Many patients thought to have relatively high likelihoods of death by one severity method were viewed as having much lower risks by other methods. Patients viewed differently by different severity methods often exhibited different prevalences of important clinical indicators of pneumonia risk. Therefore, studies of pneumonia mortality could generate different findings depending on which severity method is used for risk adjustment.

We could not return to medical records to look in detail at clinical characteristics of patients scored very differently by different severity methods. However, our analysis involving clinical findings associated with pneumonia mortality suggested that the two clinical data-based methods (MedisGroups and the physiology score) were more clinically credible measures of admission severity than the discharge abstract-based methods.

JGIM, Volume 11, January 1996

Table 5. Prevalence of Clinical Findings Among Patients with Very Different Probabilities of Death Predicted by Pairs of Severity Methods, and Mean Number of Clinical Findings per Patient

                                              Clinical Finding (CF) (a)
Severity Method                     No. of    High    No High                                 Low     High    Low     Low     Low             Mean No. CFs
A                 B                 Patients  RR      T        Coma    Lethargy  Disoriented  Na      BUN     ALB     pO2     pH      Cancer  per Case
MedisGroups       Physiology score  1,054     7.6*    97.6*    0.1     14.0*     1.5          21.4*   12.6    10.3*   19.6    11.1*   39.3*   2.3
Physiology score  MedisGroups       889       2.7     71.7     17.0*   5.3       9.0*         3.1     18.3*   2.4     43.8*   1.2     4.5     2.3

MedisGroups       Disease staging   2,414     14.5*   93.5*    5.8*    13.7*     7.7*         14.5*   36.6*   12.7*   39.9*   12.6*   11.6    2.7*
Disease staging   MedisGroups       2,370     1.4     90.2     2.1     3.9       3.4          1.8     8.8     1.6     21.7    1.6     22.6*   1.5

MedisGroups       PMC               2,525     16.5*   93.5*    7.8*    13.8*     6.5*         14.1*   35.2*   11.8*   41.0*   14.4*   22.9*   2.9*
PMC               MedisGroups       2,526     1.1     90.1     1.9     4.1       3.6          2.7     9.8     2.3     19.9    1.0     13.8    1.5

MedisGroups       APR-DRGs          2,432     15.7*   93.3*    8.0*    15.3*     8.6*         13.4*   36.6*   12.0*   36.7*   12.6*   25.4*   2.9*
APR-DRGs          MedisGroups       2,393     1.3     89.9     2.5     4.5       3.4          3.3     9.9     2.2     23.4    1.6     9.4     1.5

Physiology score  Disease staging   2,363     12.7*   88.2     10.5*   12.5*     9.3*         9.4*    37.2*   10.4*   46.0*   9.5*    7.9     2.8*
Disease staging   Physiology score  2,417     2.6     95.5*    0.0     7.0       1.4          6.3     7.2     3.5     16.4    3.2     37.6*   1.6

Physiology score  PMC               2,227     16.0*   88.3     13.0*   12.2*     9.2*         8.7*    37.2*   10.1*   50.5*   11.6*   15.7    3.0*
PMC               Physiology score  2,315     2.2     95.3*    0.0     6.6       1.5          6.4     7.6     3.9     13.7    2.0     24.0*   1.5

Physiology score  APR-DRGs          2,210     14.7*   87.7     12.7*   13.3*     11.3*        8.0*    38.6*   9.7*    44.8*   9.8*    17.1    2.9*
APR-DRGs          Physiology score  2,280     2.0     95.3*    0.0     7.0       1.0          7.9     7.4     4.5     17.1    3.3     17.8*   1.5

Disease staging   PMC               1,909     8.4*    93.3*    5.3*    8.6       4.2          6.8     20.2    5.7     30.3*   7.4*    35.4*   2.2*
PMC               Disease staging   1,920     5.2     93.1     3.4     9.2*      6.0*         7.1*    23.5*   7.8*    26.0    3.8     8.8     1.9

Disease staging   APR-DRGs          1,843     7.9*    92.9*    5.9*    9.4*      5.7*         5.9     20.7    6.0     23.2    5.3     42.2*   2.2*
APR-DRGs          Disease staging   1,890     6.8     92.3     4.1     8.3       4.6          8.9*    23.7*   8.7*    29.8*   5.4*    7.0     2.0

PMC               APR-DRGs          1,142     6.3     92.5     4.1     8.9*      5.8*         4.7     20.3    6.9*    23.1    3.3     29.2*   2.0
APR-DRGs          PMC               1,445     8.0*    93.3*    5.7*    8.7       3.2          8.2*    22.4*   6.7     35.0*   8.4*    14.5    2.2*

(a) All clinical findings significantly increased the adjusted odds ratio of dying (see Table 4). Note that because a rectal temperature ≥ 40°C significantly decreased the adjusted odds of dying, this table looks at patients without a high rectal temperature CF. RR = respiratory rate; T = temperature; Na = sodium; BUN = blood urea nitrogen; ALB = albumin.
*Patients viewed as more likely to die by severity method A had a higher prevalence of the clinical finding than patients viewed as more likely to die by method B. This pattern of clinical findings suggests that, within pairs of severity methods, A is more clinically credible than B in assessing patient severity on admission.
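As an informal consistency check (not part of the original analysis), the patient counts in Table 5 can be compared with the percentages in Table 3, taking the study cohort of 18,016 patients as the denominator. The variable names below are ours, and small discrepancies are expected because Table 3 is rounded to one decimal place.

```python
N_PATIENTS = 18016  # cohort size stated in the abstract

# (method ranking the patient sicker, other method,
#  Table 3 percentage, Table 5 patient count)
ROWS = [
    ("MedisGroups", "Physiology score", 5.9, 1054),
    ("Physiology score", "MedisGroups", 4.9, 889),
    ("MedisGroups", "Disease staging", 13.4, 2414),
    ("Disease staging", "MedisGroups", 13.2, 2370),
    ("MedisGroups", "PMC", 14.0, 2525),
    ("PMC", "MedisGroups", 14.0, 2526),
    ("MedisGroups", "APR-DRGs", 13.5, 2432),
    ("APR-DRGs", "MedisGroups", 13.3, 2393),
    ("Physiology score", "Disease staging", 13.1, 2363),
    ("Disease staging", "Physiology score", 13.4, 2417),
    ("Physiology score", "PMC", 12.4, 2227),
    ("PMC", "Physiology score", 12.8, 2315),
    ("Physiology score", "APR-DRGs", 12.3, 2210),
    ("APR-DRGs", "Physiology score", 12.6, 2280),
    ("Disease staging", "PMC", 10.6, 1909),
    ("PMC", "Disease staging", 10.7, 1920),
    ("Disease staging", "APR-DRGs", 10.2, 1843),
    ("APR-DRGs", "Disease staging", 10.5, 1890),
    ("PMC", "APR-DRGs", 6.3, 1142),
    ("APR-DRGs", "PMC", 8.1, 1445),
]

# Absolute gap, in percentage points, between the two tables for each row
gaps = [abs(100.0 * count / N_PATIENTS - pct) for _, _, pct, count in ROWS]
max_gap = max(gaps)

for (sicker, other, pct, count), gap in zip(ROWS, gaps):
    print(f"{sicker} > {other}: Table 3 {pct:.1f}%, "
          f"Table 5 {100.0 * count / N_PATIENTS:.2f}% (gap {gap:.2f})")
print(f"Largest gap: {max_gap:.2f} percentage points")
```

Under these assumptions every row agrees to within a tenth of a percentage point, which suggests that the two tables describe the same groups of patients.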

Our findings also suggested that many ICD-9-CM codes reflect clinical derangements arising after the first two hospital days. Code-based systems appeared to depend heavily on these codes in predicting high likelihoods of death. Relying on such codes perhaps improved the statistical performance (c and R²) of these code-based methods, but this practice is analogous to predicting the future after already knowing what happened. For example, if ICD-9-CM codes indicate that respiratory failure occurred sometime during the hospitalization, giving higher severity scores to patients with this code is statistically sound: 38.2% of patients with an ICD-9-CM code for respiratory failure died, compared with 8.6% of those without.

This situation raises serious questions, however, about using discharge abstract-based methods to make inferences about quality or effectiveness of care. When judging quality using severity-adjusted death rates, one should adjust only for preexisting conditions, not those arising after hospitalization.36,37 Otherwise, adjusting for events (such as respiratory failure) occurring late in the hospital stay and possibly caused by poor care may impede detection of deaths due to poor quality. For this reason, our analysis of clinical credibility focused on findings detected within the first two hospital days.
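The concern about adjusting for late-arising conditions can be made concrete with a small worked example. The death rates with and without a respiratory failure code (38.2% and 8.6%) and the code's overall prevalence (3.6%) come from the text; the two hospitals, their patient counts, and their observed deaths are entirely hypothetical and were chosen only to illustrate the mechanism.

```python
# Values reported in the text for the ICD-9-CM respiratory failure (RF) code
DEATH_RATE_WITH_RF = 0.382
DEATH_RATE_WITHOUT_RF = 0.086
RF_PREVALENCE = 0.036

# Hypothetical hospitals (NOT study data): identical admission severity, but
# hospital B has more respiratory failure arising during the stay.
HOSPITALS = {
    "A": {"patients": 1000, "rf_cases": 36, "observed_deaths": 97},
    "B": {"patients": 1000, "rf_cases": 100, "observed_deaths": 116},
}


def expected_deaths(hospital, adjust_for_rf):
    """Expected deaths with or without adjusting for the late-arising RF code."""
    n = hospital["patients"]
    if adjust_for_rf:
        rf = hospital["rf_cases"]
        return rf * DEATH_RATE_WITH_RF + (n - rf) * DEATH_RATE_WITHOUT_RF
    # Admission-only view: every patient carries the overall death rate.
    overall_rate = (RF_PREVALENCE * DEATH_RATE_WITH_RF
                    + (1 - RF_PREVALENCE) * DEATH_RATE_WITHOUT_RF)
    return n * overall_rate


def observed_to_expected(adjust_for_rf):
    """Observed-to-expected death ratio for each hypothetical hospital."""
    return {
        name: h["observed_deaths"] / expected_deaths(h, adjust_for_rf)
        for name, h in HOSPITALS.items()
    }


oe_admission_only = observed_to_expected(adjust_for_rf=False)
oe_rf_adjusted = observed_to_expected(adjust_for_rf=True)

for name in HOSPITALS:
    print(f"Hospital {name}: O/E {oe_admission_only[name]:.2f} (admission only), "
          f"{oe_rf_adjusted[name]:.2f} (adjusted for RF code)")
```

In this sketch, hospital B shows roughly 20% more deaths than expected when only admission severity is considered, but the excess disappears once the respiratory failure code enters the adjustment, which is the masking effect described above.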

Nevertheless, discharge abstract data have the significant advantage of being readily available. Abstracting clinical information from medical records is expensive, an important consideration in statewide and regional initiatives to compare hospital performance.3,4 For example, in 1994, Iowa decided that MedisGroups (mandated for larger hospitals throughout the state for several years) was too expensive. Iowa replaced MedisGroups with APR-DRGs, largely on the basis of lower data costs.

In this context, one important finding is the similarity of results pertaining to MedisGroups (based on numerous clinical data elements) and the physiology score (based on 17 variables). These two methods had comparable statistical performance and assigned patients similar predicted probabilities of death. They appeared to have comparable clinical credibility (i.e., neither system dominated the clinical finding analysis); this similarity was not surprising given that they rely on many of the same clinical variables for scoring patients. We included physiology scores not specifically to examine APACHE itself, but because of increasing interest in creating "minimum clinical data sets" containing small numbers of well-selected, physiologic variables. APACHE weights are one way to use these minimum physiologic variables, but there are certainly other approaches. The MedisGroups data abstraction protocol collects a generic set of more than 200 potential KCFs regardless of patient diagnosis.

This study has important limitations. Results may vary for conditions other than pneumonia.38,39 The 1992 MedisGroups Comparative Database contains information only from MedisGroups users or from hospitals in states requiring MedisGroups. Independent information about data reliability was not available. The clinical information in the data set was specifically gathered for MedisGroups scoring, giving MedisGroups a possible advantage in evaluating statistical performance and the clinical finding analysis. For example, the MedisGroups algorithm for rating pneumonia severity on admission explicitly considers high BUN, low sodium, coma, high temperature, and high respiratory rate.14 These methods are periodically revised; newer versions may provide different results.

The MedisGroups data contained information on only in-hospital deaths. Post-discharge mortality information is useful because it permits holding the window of observation constant (e.g., at 30 days following admission). This is critically important to compare mortality outcomes across providers with differing discharge practices.40 However, our goal was not to compare death rates across hospitals. We have no reason to expect that our overall finding, that different severity methods ranked many patients differently by probability of death, would change if we had looked at 30-day mortality.

Our findings suggest that analyses of deaths from pneumonia need to be sensitive to the method used for risk adjustment. Choosing the appropriate severity method requires a variety of considerations, including the underlying conceptual framework, statistical performance, and assessments of clinical credibility for the intended purpose. Physicians need to be involved in evaluating these methods because physicians, ultimately, will be the ones who have to address their results.

We thank Jennifer Daley, MD, for assisting in the choice of clinical findings for analysis.

REFERENCES

1. United States General Accounting Office; Health, Education, and Human Services Division. Health care reform: "Report cards" are useful but significant issues need to be addressed. GAO/HEHS-94-219. Washington, DC: United States General Accounting Office, 1994.

2. United States General Accounting Office; Health, Education, and Human Services Division. Employers urge hospitals to battle costs using performance data. GAO/HEHS-95-1. Washington, DC: United States General Accounting Office, 1994.

3. Iezzoni LI, Shwartz M, Restuccia J. The role of severity information in health policy debates: a survey of state and regional concerns. Inquiry. 1991;28:117-28.

4. Iezzoni LI, Greenberg LG. Widespread assessment of risk-adjusted outcomes: lessons from local initiatives. Joint Comm J Qual Improv. 1994;20:305-16.

5. Localio AR, Hamory BH, Sharp TJ, Weaver SL, TenHave TR, Landis JR. Comparing hospital mortality in adult patients with pneumonia. A case study of statistical methods in a managed care program. Ann Intern Med. 1995;122:125-32.

6. Epstein A. Performance reports on quality--prototypes, problems, and prospects. N Engl J Med. 1995;333:57-61.

7. Selker HP. Systems for comparing actual and predicted mortality rates: characteristics to promote cooperation in improving hospital care. Ann Intern Med. 1993;118:820-2.

8. Kassirer JP. The use and abuse of practice profiles. N Engl J Med. 1994;330:634-6.

9. U.S. Congress, Office of Technology Assessment. Identifying health technologies that work: searching for evidence. OTA-H-608. Washington, DC: U.S. Government Printing Office, 1994.

10. McMahon LF, Billi JE. Measurement of severity of illness and the Medicare prospective payment system: state of the art and future directions. J Gen Intern Med. 1988;3:482-90.

11. The Quality, Measurement and Management Project. The Hospital Administrator's Guide to Severity Measurement Systems. Chicago: The Hospital Research and Educational Trust of the American Hospital Association, 1989.

12. Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI: Health Administration Press, 1994.

13. Iezzoni LI. 'Black box' medical information systems: a technology needing assessment. JAMA. 1991;265:3006-7.

14. Iezzoni LI. Dimensions of risk. In: Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI: Health Administration Press, 1994;29-118.

15. U.S. Department of Health, Education and Welfare; National Committee on Vital and Health Statistics. Uniform Hospital Discharge Data Minimum Data Set. DHEW Pub. No. (PHS) 80-1157. Hyattsville, MD: U.S. Department of Health, Education and Welfare, 1980.

16. Iezzoni LI. Data sources and implications: administrative data bases. In: Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI: Health Administration Press, 1994;119-75.

17. Brewster AC, Karlin BG, Hyde LA, Jacobs CM, Bradbury RC, Chae YM. MEDISGRPS: a clinically based approach to classifying hospital patients at admission. Inquiry. 1985;12:377-87.

18. Iezzoni LI, Moskowitz MA. A clinical assessment of MedisGroups. JAMA. 1988;260:3159-63.

19. Steen PM, Brewster AC, Bradbury RC, Estabrook E, Young JA. Predicted probabilities of hospital death as a measure of admission severity of illness. Inquiry. 1993;30:128-41.

20. Knaus WA, Wagner DP, Draper EA, et al. The APACHE III prognostic system: risk prediction of hospital mortality for critically ill hospitalized adults. Chest. 1991;100:1619-36.

21. Knaus WA, Wagner DP, Zimmerman JE, Draper EA. Variations in mortality and length of stay in intensive care units. Ann Intern Med. 1993;118:753-61.

22. Gonnella JS, Hornbrook MC, Louis DZ. Staging of disease: a case-mix measurement. JAMA. 1984;251:637-44.

23. Markson LE, Nash DB, Louis DZ, Gonnella JS. Clinical outcomes management and disease staging. Eval Health Prof. 1991;14:201-27.

24. Young WW, Kobler S, Kowalski J. PMC patient severity scale: derivation and validation. Health Serv Res. 1994;29:367-90.

25. All Patient Refined Diagnosis Related Groups: Definition Manual. Wallingford, CT: 3M Health Information Systems, 1993.

26. Edwards N, Honemann D, Burley D, Navarro M. Refinement of Medicare diagnosis-related groups to incorporate a measure of severity. Health Care Fin Rev. 1994;16:45-64.

27. Iezzoni LI, Shwartz M, Ash AS, Hughes JS, Daley J, Mackiernan YD. Severity measurement methods and predicting pneumonia deaths. Med Care (in press).

28. Iezzoni LI, Hotchkin EK, Ash AS, Shwartz M, Mackiernan Y. MedisGroups databases: the impact of data collection guidelines on predicting in-hospital mortality. Med Care. 1993;31:277-83.

29. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818-29.

30. Harrell FE, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984;3:143-52.

31. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.

32. Hanley JA, McNeil BJ. A method of comparing the area under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839-43.

33. Daley J. Validity of risk-adjustment methods. In: Iezzoni LI, ed. Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI: Health Administration Press, 1994;254.

34. Fine EM, Hanusa BH, Lave JR, et al. Comparison of severity of illness measures in patients with community acquired pneumonia. J Gen Intern Med. 1995;10:359-68.

35. Daley J, Jencks S, Draper D, Lenhart G, Thomas N, Walker J. Predicting hospital-associated mortality for Medicare patients: a method for patients with stroke, pneumonia, acute myocardial infarction and congestive heart failure. JAMA. 1988;260:3622-4.

36. Iezzoni LI, Foley SM, Heeren T, et al. A method for screening the quality of hospital care using administrative data: preliminary validation results. Qual Rev Bull. 1992;18:361-71.

37. Shapiro MF, Park RE, Keesey J, Brook RH. The effect of alternative case-mix adjustments on mortality differences between municipal and voluntary hospitals in New York City. Health Serv Res. 1994;29:95-112.

38. Iezzoni LI, Shwartz M, Ash AS, et al. Evaluating Severity Adjustors for Patient Outcome Studies. Final Report. Beth Israel Hospital, Boston, May 10, 1995. Prepared for the Agency for Health Care Policy and Research under grant RO1-HS06742.

39. Iezzoni LI, Ash AS, Shwartz M, Daley J, Hughes JS, Mackiernan YD. Predicting who dies depends on how severity is measured: implications for evaluating patient outcomes. Ann Intern Med. 1995;123:763-70.

40. Jencks SF, Williams DK, Kay TL. Assessing hospital-associated deaths from discharge data: the role of length of stay and comorbidities. JAMA. 1988;260:2240-6.
