
Int. J. Six Sigma and Competitive Advantage, Vol. 4, No. 3, 2008 305

Data envelopment analysis models for identifying and benchmarking the best healthcare processes

James C. Benneyan*

Department of Mechanical and Industrial Engineering

Northeastern University

360 Huntington Avenue, Boston, MA 02115, USA

Fax: 6173732921

E-mail: [email protected]

*Corresponding author

Aysun Sunnetci and Mehmet Erkan Ceyhan

Northeastern University

334 Snell Engineering Center

Boston, MA 02115, USA



Abstract: We illustrate the use of Data Envelopment Analysis (DEA) models within process improvement work for identifying and benchmarking the best healthcare systems, in terms of most efficiently producing desirable outcomes from consumed resources. This approach is useful when comparing several systems that use multiple types of inputs (e.g., operating costs, clinicians, staff) to produce multiple outputs (e.g., outcomes, satisfaction, access), such as those commonly found in balanced scorecards and dashboard datasets, and provides the analyst with relative scores and rankings for each system, targets for each measure that would move inefficient systems to the best performance frontier, and a list of other systems to benchmark and emulate in order to improve. Modified DEA models are proposed to address four common issues that frequently arise in such contexts, including rationally constraining the weights given to each measure and handling missing, estimated or proportional data (such as adverse event or mortality rates). These models can be used to compare hospitals, departments, national healthcare systems, and regional or state systems and are useful to help understand how to improve sub-optimal processes and set feasible targets. This approach is illustrated at department, hospital, state, and country levels, with overall results showing very little correlation with less quantitative benchmarking studies.

Keywords: benchmarking; healthcare; data envelopment analysis; DEA; weight restrictions; proportional data; hyper-efficiency.

Reference to this paper should be made as follows: Benneyan, J.C., Sunnetci, A. and Ceyhan, M.E. (2008) 'Data envelopment analysis models for identifying and benchmarking the best healthcare processes', Int. J. Six Sigma and Competitive Advantage, Vol. 4, No. 3, pp.305–331.

Copyright 2008 Inderscience Enterprises Ltd.



Biographical notes: James C. Benneyan, PhD, is an Associate Professor of Industrial Engineering and Operations Research and the Director of the Quality and Productivity Laboratory at Northeastern University, USA, a faculty member for the Institute for Healthcare Improvement and Advisor to several national healthcare improvement projects. Previously, he was a Senior Systems Engineer at Harvard Community Health Plan and is a past President and Fellow of the Society for Health Systems. His research areas include statistical methods for quality improvement, healthcare systems engineering and operations research in nanotechnology.

Aysun Sunnetci received her PhD in Industrial Engineering from Northeastern University in Boston, Massachusetts, USA. Her research addresses two DEA problems that frequently arise in practice: the handling of proportional data in constant-returns-to-scale models and methods for constraining the weighting of decision variables within these models.

Mehmet Erkan Ceyhan is a PhD candidate in Industrial Engineering at Northeastern University in Boston, Massachusetts, USA. His research focuses on estimated proportions and ranked data in DEA and benchmarking analysis of national healthcare systems.

1 Introduction

In quality improvement and six sigma activities, benchmarking serves an important role

for identifying best practices, understanding deficiencies, and setting targets (Burstin

et al., 1999). First employed by Xerox in the 1970s, benchmarking has become

a common business practice for supporting continuous process improvement and management decision making (McNair and Leibfried, 1992). In the classic Six Sigma

Define, Measure, Analyse, Improve, Control (DMAIC) approach, for example,

benchmarking can contribute to the measurement, analysis, and improvement activities.

This paper discusses and illustrates the use of Data Envelopment Analysis (DEA) for

benchmarking healthcare systems within these types of process improvement or Six

Sigma contexts. The intent is to illustrate how DEA can be used within these contexts

through a variety of examples, rather than provide a comprehensive review of DEA

theory or detailed results of each study; where appropriate, references are provided to

such information and to each of the cited studies.

In general, benchmarking activities compare processes across organisations

(Stevenson, 1998), including efforts to identify potential comparison partners, understand

relative strengths and weaknesses, identify areas for improvement, determine gaps, and

set goals (Collins-Fulea et al., 2005). An experience of Westinghouse's Electric Systems

Division is a successful example, becoming world class in part by adapting better

processes for material handling from Texas Instruments, subcontracting from Boeing, and

work team organisation from Rockwell. Within healthcare, a study by Solucient found

that annually 57 000 additional patients would survive, 18% fewer medical complications

would occur, average hospital lengths of stay would decrease significantly, and $9.5

billion would be saved if all hospitals in the USA performed as well as the best hospitals

(Chenoweth, 2003).



Despite the clear value of identifying and transferring best practices, many

benchmarking approaches are fairly subjective in the manner by which they weight

performance metrics and determine top performers, often including qualitative

comparisons, questionnaires and surveys, expert assessments, and case study

comparisons. Often some type of score is computed for each organisation by applying

largely subjective weights or ranks to various measures, as described below.

Additionally, while most benchmarking tools identify an organisation's relative strengths,

few provide additional information to help the underperforming organisations improve,

set goals, and identify peer organisations to emulate. In contrast, DEA is a quantitative

optimisation method for comparing entities (called Decision-making Units or DMUs)

in order to mathematically determine their relative efficiencies, assign weights to each

variable, set targets, and identify the best DMUs for further study.

2 Methodology

2.1 Efficiency frontier model

Originally developed within the operations research and econometrics communities, data

envelopment analysis is a mathematical method, based in linear programming models, for

comparing the relative efficiencies of multiple decision-making units at transforming the

multiple types of inputs each consumes into the types of multiple outputs each produces

(Charnes et al., 1981; Cooper et al., 2000). Inputs might include the number of clinical

staff, nurse-to-patient levels, and operating costs per patient day, whereas outputs might

include clinical outcomes, access, patient satisfaction, and safety. DEA mathematically

compares these measures across all DMUs in order to construct a production efficiency

or best-practice frontier consisting of those organisations that achieve the best weighted combination of maximal outputs from minimal inputs (Medina-Borja et al., 2007).

Conceptually, the most efficient DMUs define an efficiency frontier that envelops

all the other DMUs, as illustrated graphically in Figure 1 for a simple one-input

one-output case, where the horizontal and vertical axes correspond to input and output

levels, respectively. In this example, the DMUs labelled A, B, and C comprise the

Variable Returns-to-Scale (VRS) frontier, whereas the inefficient DMUs D, E, and F can

reach this frontier by producing either the same amount of output with less input or more

output with the same amount of input. Mathematically, a set of fractional optimisation

programmes based on total weighted output-over-input ratios (transformed into Linear

Programmes (LPs) for ease of solution) is solved to determine four results for each DMU:

1 an overall efficiency score between 0 and 1 relative to the other DMUs (where a

score of 1 indicates the DMU (e.g., facility) is on the frontier)

2 optimal weights for each input and output that maximise this score relative to those

of all other DMUs

3 target values for each input and output that would move this DMU onto the

efficiency frontier (if not currently best-in-class)

4 a subset list of the other DMUs that form a reference set for further study and

benchmarking (where becoming a weighted combination of these DMUs would

move a non-frontier entity to best in class).
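For the one-input, one-output case sketched in Figure 1, the CRS score, targets, and reference DMU can be computed without a solver, because the CRS frontier is the ray through the DMU with the largest output-to-input ratio. The following Python sketch illustrates this under assumed data: the six input and output values and the function name are hypothetical and are not taken from Figure 1.

```python
def crs_single_ratio(inputs, outputs):
    """CRS DEA for one input and one output per DMU (no LP needed)."""
    ratios = {k: outputs[k] / inputs[k] for k in inputs}
    best = max(ratios, key=ratios.get)            # DMU defining the CRS frontier
    results = {}
    for k in inputs:
        score = ratios[k] / ratios[best]          # efficiency in (0, 1]
        results[k] = {
            "score": score,
            "input_target": inputs[k] * score,    # same output from less input
            "output_target": outputs[k] / score,  # more output from same input
            "reference": best,
        }
    return results


# Hypothetical data for six DMUs (not the values plotted in Figure 1)
inputs = {"A": 2.0, "B": 4.0, "C": 8.0, "D": 5.0, "E": 6.0, "F": 9.0}
outputs = {"A": 2.0, "B": 6.0, "C": 8.0, "D": 4.0, "E": 3.0, "F": 6.0}
results = crs_single_ratio(inputs, outputs)
```

With several inputs and outputs the weights are no longer fixed by a single ratio, which is why the general case requires the optimisation programmes described next.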




Figure 1 Example of constant and variable returns-to-scale efficiency frontiers

[Figure 1: plot of output (vertical axis) against input (horizontal axis) showing DMUs A to F, the CRS frontier, and the VRS frontier.]

These LPs are solved iteratively, once for each DMU, along with first and second phase

dual models (Cooper et al., 2000) to produce the results described above. While several

different formulations exist, in general all DEA models seek to maximise the ratio of a

weighted sum of outputs over a weighted sum of inputs, as shown in Table 1, where K is

the number of DMUs, M is the number of outputs, N is the number of inputs, and e is the

current DMU being measured. DEA models can assume either Constant Returns-to-Scale

(CRS) or VRS in the relationship between inputs and outputs, as illustrated in Figure 1,

and can be input- or output-oriented. An input-oriented model aims to minimise the level

of inputs while producing the same level of outputs, whereas an output-oriented model

aims to maximise the level of outputs while consuming the same level of inputs. These

two orientation formulations identify the same efficient and inefficient DMUs (in the

CRS case with the output-oriented efficiency score equal to the reciprocal of that of the

input-oriented model), but with different targets, weights, and reference sets.

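Table 1 Constant Returns to Scale (CRS) input- and output-oriented DEA models (in fractional programme form)

Input-oriented CRS model:

$$
\begin{aligned}
\text{maximise}\quad & z_e = \frac{\sum_{j=1}^{M} u_j O_{je}}{\sum_{i=1}^{N} v_i I_{ie}} \\
\text{subject to}\quad & \frac{\sum_{j=1}^{M} u_j O_{jk}}{\sum_{i=1}^{N} v_i I_{ik}} \le 1, \quad k = 1,\ldots,K \\
& u_j \ge 0, \quad j = 1,\ldots,M \\
& v_i \ge 0, \quad i = 1,\ldots,N
\end{aligned}
$$

Output-oriented CRS model:

$$
\begin{aligned}
\text{minimise}\quad & z_e = \frac{\sum_{i=1}^{N} v_i I_{ie}}{\sum_{j=1}^{M} u_j O_{je}} \\
\text{subject to}\quad & \frac{\sum_{i=1}^{N} v_i I_{ik}}{\sum_{j=1}^{M} u_j O_{jk}} \ge 1, \quad k = 1,\ldots,K \\
& u_j \ge 0, \quad j = 1,\ldots,M \\
& v_i \ge 0, \quad i = 1,\ldots,N
\end{aligned}
$$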
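The fractional programmes of Table 1 are transformed into linear programmes before solution. As a sketch of the standard Charnes-Cooper transformation, stated here for the input-oriented CRS model in the notation of Table 1, the weighted input of DMU e is normalised to 1, which gives:

```latex
\begin{aligned}
\text{maximise}\quad & z_e = \sum_{j=1}^{M} u_j O_{je} \\
\text{subject to}\quad & \sum_{i=1}^{N} v_i I_{ie} = 1, \\
& \sum_{j=1}^{M} u_j O_{jk} - \sum_{i=1}^{N} v_i I_{ik} \le 0, \quad k = 1,\ldots,K, \\
& u_j \ge 0,\ j = 1,\ldots,M; \qquad v_i \ge 0,\ i = 1,\ldots,N.
\end{aligned}
```

The output-oriented linear programme follows in the same way by normalising the weighted output of DMU e to 1 and minimising its weighted input.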




2.2 A simple example

Figure 2 illustrates the general framework of a typical DEA study, here comparing six

hypothetical hospitals each with three inputs (cost/charge ratio, FTE/bed ratio, and a

case-mix adjusted average length of stay index) and three outputs (adjusted mortality,

patient satisfaction, and access) that might be desirable to minimise and maximise,

respectively (since outputs are maximised in DEA, mortality rates were converted to

non-mortality rates). As shown, Hospitals 4 and 6 are top ranked and on the frontier (with

scores of 1.0) and appear in the peer benchmark sets of the other four hospitals, implying

that the others can improve by studying and emulating them. The next step in such an

analysis would be to conduct an in-depth study of these two hospitals to develop insights

as to how they are able to perform better. In order to illustrate this approach and the

breadth of uses of DEA within healthcare, several recent studies are summarised below,

at times using modified models as described in the following section.

Figure 2 Illustrative example of DEA analysis of six hypothetical hospitals

                      Outcomes                           Inputs
Hospital   Adj. mortality   Pat. sat.   Access     Cost/charge ratio   FTE/bed ratio   Adj. LOS index
H1         .95              8           3 days     .15                 .25             .20
H2         .85              14          4 days     .025                .075            .25
H3         .83              20          10 days    .10                 .12             .4
H4         .47              24          14 days    .03                 .05             .10
H5         .35              30          20 days    .20                 .30             .25
H6         .98              7           2 days     .01                 .03             .02

DEA results
Hospital   Score   Rank   Peers   Freq. benchmarked
H1         0.134   6      4, 6    0
H2         0.734   3      6, 4    0
H3         0.414   5      4, 6    0
H4         1.000   1      4       4
H5         0.571   4      4       0
H6         1.000   1      6       3

Note: Mortality is converted to non-mortality in order to be a larger is better output.
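The frequency with which each hospital is benchmarked follows directly from the peer sets. The short Python sketch below recomputes it from the peers column of Figure 2 as read here; excluding a frontier hospital's appearance in its own peer set is an assumption, chosen because it reproduces the counts shown in the figure, and the function name is hypothetical.

```python
from collections import Counter

# Peer (reference) sets as read from the DEA results in Figure 2
peers = {
    "H1": ["H4", "H6"],
    "H2": ["H6", "H4"],
    "H3": ["H4", "H6"],
    "H4": ["H4"],
    "H5": ["H4"],
    "H6": ["H6"],
}


def benchmark_frequency(peers):
    """Count how often each DMU appears in another DMU's peer set."""
    counts = Counter(p for dmu, ps in peers.items() for p in ps if p != dmu)
    return {dmu: counts.get(dmu, 0) for dmu in peers}


freq = benchmark_frequency(peers)
```

Hospitals with the highest counts are the natural first candidates for the in-depth study described above.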

2.3 Model extensions

Two modelling issues that frequently arise in many healthcare applications include

proportional data (often estimated or missing) and irrational weights computed for

some measures. In the first case, many key healthcare data are proportions bound

between 0 and 1 (such as mortality, infection, adverse event, and appointment access

rates), violating the usual DEA assumption that all data can take any positive value.

Similar data also arise in other industries, such as defect, graduation, and customer

retention rates. Scalar data bound on a fixed interval present a similar problem, such as

patient satisfaction scores between 1 and 5 or life expectancies, as opposed to being

unbounded above.

Solving conventional CRS models in such cases theoretically can produce

nonsensical target values that exceed their upper possibilities (e.g., 130% survival or 420

years life expectancy). Borrowing an idea from logistic regression (Amemiya, 1985),


a simple Odds-Ratio (OR) transformation instead can be used to ensure all targets lie within their logical bounds, converting each proportion p on the (0,1) interval to a positive real number odds ratio p/(1 − p), offering the modeller an easy alternative when VRS relationships are not appropriate; notationally:

$$I^{OR}_{k,j} = \frac{I_{k,j}}{1 - I_{k,j}} \qquad (1)$$

and

$$O^{OR}_{k,i} = \frac{O_{k,i}}{1 - O_{k,i}} \qquad (2)$$

for the proportional j-th input $I_{k,j}$ or i-th output $O_{k,i}$ of DMU k, where now $0 < I^{OR}_{k,j} < \infty$ and $0 < O^{OR}_{k,i} < \infty$. Substituting these odds-ratios for all proportional inputs and outputs, DEA models can be solved in the usual manner, with the resultant odds-ratio targets $I^{OR*}_{k,j}$ and $O^{OR*}_{k,i}$ then back-transformed to proportional targets $I^{*}_{k,j}$ and $O^{*}_{k,i}$ as:

$$I^{*}_{k,j} = \frac{1}{1 + 1/I^{OR*}_{k,j}} \qquad (3)$$

and

$$O^{*}_{k,i} = \frac{1}{1 + 1/O^{OR*}_{k,i}}. \qquad (4)$$
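The transformation and its inverse are simple to apply in practice. The following Python sketch shows Equations (1) to (4) as two helper functions and illustrates why a back-transformed target stays inside (0, 1); the function names, the 0.90 survival rate, and the 1.5 scaling factor are hypothetical values chosen for illustration only.

```python
def to_odds(p):
    """Equations (1)-(2): map a proportion in (0, 1) to the odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def from_odds(odds):
    """Equations (3)-(4): map a positive odds target back to a proportion."""
    if odds <= 0.0:
        raise ValueError("odds must be positive")
    return 1.0 / (1.0 + 1.0 / odds)


survival = 0.90                             # hypothetical non-mortality rate
odds = to_odds(survival)                    # approximately 9
scaled_odds = 1.5 * odds                    # hypothetical target: 50% more output
naive_target = 1.5 * survival               # exceeds 1, an impossible proportion
bounded_target = from_odds(scaled_odds)     # remains strictly below 1
```

Scaling the raw proportion gives a target above 100%, whereas scaling the odds and back-transforming gives a target of roughly 0.93.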

The impact of this approach on efficiency scores, weights, reference sets, peer weights,

and targets is illustrated below and explored in greater detail by Benneyan and Sunnetci (2008). Approaches to the related modelling problems of non-proportional data bound on

an (a, b) interval (such as ratings between 1 and 10), estimated probabilities, and missing

data also are discussed by the above authors, Ceyhan and Benneyan (2008), Benneyan

et al. (2006), and Aksezer and Benneyan (2003), including multiple imputation,

bootstrapping, and Monte Carlo methods.

A second periodic problem that arises when using DEA to benchmark healthcare

systems is the production of irrational weights, such as placing greater weight on patient

satisfaction than on mortality (in the extreme case with zero weight essentially ignoring

important variables). Several possible modelling approaches to address this problem are

summarised in Table 2 and described below, the first two taken from the DEA literature,

along with their advantages and disadvantages.

The simplest approach is to rank order all weights via additional constraints that force the desired relative ordering, e.g., u2 ≥ u4 or v1 ≥ v2 ≥ v3, although this typically produces equal weights if the constraint would have been violated in the unbounded case. A second frequent approach is to assign upper or lower bounds to weights, such as v1 ≤ a or u2 ≥ b, where a and b are some desired constants. Since weights mathematically are unbounded above, however, these values are somewhat meaningless. An extension of this idea that lends more meaning, however, is to specify or bound the percentages that each measure can receive from the total weight given to all measures, e.g., ui = gi(u1 + u2 + … + uM) or vj ≥ hj(v1 + v2 + … + vN), where g1 + g2 + … + gM = h1 + h2 + … + hN = 1 (Sunnetci and Benneyan, 2008). These Percent-of-Total (POT) constraints can be limited further to desired ranges, i.e., ai ≤ gi ≤ bi and cj ≤ hj ≤ dj, or only specified for some of the weights.

Table 2 Possible weight-restricting approaches, advantages, and disadvantages

Approach: Simple ranking
Example: u2 ≥ u1; u2 ≥ u3
Advantages: Easily applied. Prevents the problem of allowing more weight on less important variables.
Disadvantages: Does not prevent zero weights. Typically produces equal weights (e.g., u2 = u1).

Approach: Lower bounds
Example: v1 ≥ 0.43, v2 ≥ 0.33, v3 ≥ 0.18; u1 ≥ 0.14, u2 ≥ 0.58, u3 ≥ 0.22
Advantages: Prevents problems of irrational ranking and zero weights.
Disadvantages: Lower bounds lack much meaning since weights are unbounded above. Difficult to determine or agree on arbitrary bounds. Frequently a feasible solution cannot be found (especially if lower bound >> zero).

Approach: Percent of Total (POT)
Example: u1 = .25(u1 + u2 + u3); u2 = .50(u1 + u2 + u3); u3 = .25(u1 + u2 + u3)
Advantages: Prevents both above problems.
Disadvantages: Hard to determine specific percentages, which are still somewhat subjective.

Notes: First two methods in literature, third method is proposed here.
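When a POT share is fixed for every output weight and every input weight (as in the POT row of Table 2), only the overall scale of the weights remains free, so the CRS ratio model reduces to comparing each DMU's fixed-share ratio with the best such ratio. The Python sketch below illustrates this special case only; the data, the shares, and the function name are hypothetical, and partially specified or range-bounded shares would still require solving the linear programme.

```python
def pot_scores(inputs, outputs, h, g):
    """CRS scores when all weights are fixed percent-of-total shares.

    inputs[k], outputs[k]: lists of input and output values for DMU k
    h, g: input and output shares, each summing to 1
    """
    if abs(sum(h) - 1.0) > 1e-9 or abs(sum(g) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    ratio = {}
    for k in inputs:
        weighted_out = sum(gi * o for gi, o in zip(g, outputs[k]))
        weighted_in = sum(hj * x for hj, x in zip(h, inputs[k]))
        ratio[k] = weighted_out / weighted_in
    best = max(ratio.values())
    # Scaling the weights so the best ratio equals 1 gives each DMU's score
    return {k: r / best for k, r in ratio.items()}


# Hypothetical data: two inputs and three outputs for three DMUs
inputs = {"U1": [2.0, 4.0], "U2": [3.0, 3.0], "U3": [4.0, 6.0]}
outputs = {"U1": [4.0, 8.0, 4.0], "U2": [6.0, 6.0, 6.0], "U3": [4.0, 8.0, 4.0]}
scores = pot_scores(inputs, outputs, h=[0.5, 0.5], g=[0.25, 0.50, 0.25])
```

Because the shares apply to the raw weights, the resulting scores depend on the units in which each measure is recorded, which is one reason the choice of percentages remains somewhat subjective.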

Given that these POT gi and hj values also may be somewhat subjective, the fraction

of the entire possible (gi, hj) space can be identified for which each particular DMU is

efficient, referred to here as its hyper-efficiency score, with any DMUs on the frontier

for all possible values called hyper-efficient. These results can be identified or

estimated by iterative search, numerical methods, or a Monte Carlo scattering approach

that repeatedly solves the DEA models using random (gi, hj) values, somewhat measuring

a DMU's efficiency robustness under any set of weights. An alternate method to address

arbitrary weights, called cross-efficiency, computes the average efficiency score for each

DMU based only on the optimal weights of all other DMUs (Sexton et al., 1986; Doyle

and Green, 1994), in essence considering how efficient a DMU would be using (only) the

weights of the other DMUs.
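The Monte Carlo scattering idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure used in the studies below: shares are drawn uniformly on the simplex, every weight is given a fixed share in each replication (so each score reduces to a normalised ratio and no LP solver is needed), and the data and function names are hypothetical.

```python
import random


def fixed_share_scores(inputs, outputs, h, g):
    """CRS scores when every weight is a fixed share of the total weight."""
    ratio = {
        k: sum(gi * o for gi, o in zip(g, outputs[k]))
        / sum(hj * x for hj, x in zip(h, inputs[k]))
        for k in inputs
    }
    best = max(ratio.values())
    return {k: r / best for k, r in ratio.items()}


def random_shares(n, rng):
    """Draw n positive shares that sum to 1 (uniform on the simplex)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def hyper_efficiency(inputs, outputs, replications=1000, seed=1):
    """Return percent efficient and average efficiency for each DMU."""
    rng = random.Random(seed)
    n_in = len(next(iter(inputs.values())))
    n_out = len(next(iter(outputs.values())))
    hits = {k: 0 for k in inputs}
    totals = {k: 0.0 for k in inputs}
    for _ in range(replications):
        scores = fixed_share_scores(
            inputs, outputs, random_shares(n_in, rng), random_shares(n_out, rng)
        )
        for k, s in scores.items():
            totals[k] += s
            hits[k] += s >= 1.0 - 1e-12   # on the frontier for this draw
    return {
        k: {
            "percent_efficient": 100.0 * hits[k] / replications,
            "average_efficiency": totals[k] / replications,
        }
        for k in inputs
    }


# Hypothetical data: "A" uses less of every input and produces more of every output
inputs = {"A": [1.0, 2.0], "B": [2.0, 3.0], "C": [4.0, 2.5]}
outputs = {"A": [9.0, 8.0], "B": [6.0, 7.0], "C": [5.0, 3.0]}
summary = hyper_efficiency(inputs, outputs)
```

Here percent_efficient plays the role of the hyper-efficiency score, and a DMU that is efficient in every replication, such as "A" in this example, would be called hyper-efficient.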

In the below examples, all analyses were conducted using CRS output-oriented

models with all proportions and scalar data transformed to OR and all smaller-is-better

outputs (such as adverse event (AE) and mortality rates) subtracted from 1; weight restrictions or missing data imputation are noted when used and weighting robustness is measured via

hyper-efficiency.




3 Applications

3.1 Hospital benchmarking

To illustrate a basic DEA study and the above modelling approaches, Table 3 summarises

an analysis of 17 hospitals, where the provided inputs were the costs of administration

and support, information systems, supplies, lab and imaging, nursing, and ancillary

services and rehabilitation. The outputs of interest were various clinical outcome and

patient safety measures (surgery quality, Cesarean related quality, failure to rescue rates,

surgery adverse event rates, delivery adverse event rates, and post operation adverse

event rates) that also serve as surrogates for the overall process quality. For each hospital

in Table 3, the first, second, and third rows contain their current data, targets, and

weights, respectively.

As shown in the second column, seven hospitals are on the best-practice frontier (with

scores of 1.0 and targets equal to their original values since they already are the top performers). The challenge for inefficient hospitals is to benchmark those on the frontier

(or find other ways to reduce their inputs and increase their outputs to their computed

targets) in order to become as good as those with scores of 1. For example, Hospital 1

would become top-ranked if it could change its inputs and outputs to the target

levels shown in its second row (i.e., reduce its administration and support costs from

$3,619 to $1,578 and its surgery adverse event rate from .0473 to .0003, along with the

other targets).

Also note that, as described above, the DEA model set several weights irrationally in

order to maximise some DMUs' scores. Hospital 2, for example, has been made to appear

efficient by setting the weights equal to zero for inputs 1, 2, 5, and 6 and for outputs 1, 2,

3, 4, and 5, placing little to no weight on measures for which this hospital performs

poorly. Additionally, Cesarean related quality has been weighted significantly lower (by more than 90%) than surgery quality.

In a similar analysis, Table 4 summarises unrestricted DEA results for the US News

and World Report (USNWR) annual published study of the best US hospitals, which in

2007 placed 17 hospitals on an honour roll. As above, the first and second rows for each

hospital contain the targets and weights, whereas the first and second (in parentheses)

values in the score column are the DEA and USNWR scores, respectively, where the

USNWR results were computed using a subjective weighting scheme where structure,

process, and outcome measures each received one-third of the weight (McFarlane et al.,

2007). Duke University Medical Center, for example, would become top-ranked if

it could achieve the target values shown in its first row. Note again, however, that

some measures receive zero or irrational weights (highlighted in italics and grey

shading, respectively).

As shown in Figure 3, furthermore, in contrast to the USNWR scores (R² = 0.8535,

p = 0.00000005), the hospital-wide and average department DEA scores (unrestricted)

have little correlation to each other (R² = 0.1288, p = 0.1436). The DEA results were

calculated by solving a separate model for the hospitals and for each type of department,

as described below, with low and variable correlations between departments suggesting

process and practice differences within hospitals. The correlations in Table 5 summarise

these differences for both the DEA and USNWR results.




Table 4 Unrestricted CRS model results for US News and World Report honour roll (hospital-wide)

                                              Inputs                                      Outputs
Hospital           DEA score  (USNWR)  Row  Nursing   Advanced   Patient    Reputation  Non-mortality  Discharges
                                            index     services   services
Johns Hopkins      1          (1)      T    1.9000    51.0000    85.0000    33.5750     1.6925         32 127.00
                                       W    0.1152    0.0153     0.0000     0.0298      0.0000         0.0000
Mayo Clinic        1          (0.967)  T    2.8000    50.5000    85.0000    36.8000     1.5748         95 317.00
                                       W    0.0000    0.0198     0.0000     0.0272      0.0000         0.0000
UCLA               1          (0.833)  T    2.4000    57.0000    49.0000    15.8917     1.7167         25 504.00
                                       W    0.0000    0.0108     0.0079     0.0130      0.4619         0.0000
Cleveland          1          (0.833)  T    2.0000    57.0000    75.0000    25.7750     1.4797         60 769.00
                                       W    0.2163    0.0018     0.0062     0.0052      0.4369         0.0000
Mass General       0.8983     (0.767)  T    2.0000    55.0000    75.0000    25.2516     1.4600         62 920.37
                                       W    0.2416    0.0020     0.0070     0.0058      0.4881         0.0000
NY Presbyterian    1          (0.700)  T    1.7000    58.0000    85.0000    14.1750     1.3986         86 964.00
                                       W    0.1708    0.0122     0.0000     0.0000      0.0000         0.00001
Duke University    0.9122     (0.600)  T    1.6000    51.0874    74.0000    12.6984     1.4425         44 194.96
                                       W    0.3697    0.0000     0.0068     0.0031      0.6356         0.0000
UCSF               0.8439     (0.600)  T    2.2000    56.0000    61.0000    16.6399     1.7133         27 323.44
                                       W    0.2812    0.0042     0.0054     0.0013      0.6257         0.0000
Barnes-J.          0.8361     (0.567)  T    2.1000    56.5000    85.0000    15.9637     1.6534         70 264.03
                                       W    0.2723    0.00790    0.0021     0.0000      0.5887         0.0000
Brigham & Women's  0.9006     (0.533)  T    2.3000    58.0000    75.0000    11.8118     1.8404         39 213.05
                                       W    0.2445    0.0030     0.0050     0.0000      0.5402         0.0000
U WA               0.9232     (0.500)  T    2.1000    40.5000    64.9492    16.9696     1.5740         31 918.15
                                       W    0.2676    0.0129     0.0000     0.0000      0.6882         0.0000
U Penn             1          (0.367)  T    1.5000    49.0000    71.0000    8.1300      1.4815         22 703.00
                                       W    0.3000    0.0112     0.0000     0.0000      0.6750         0.0000
U Pitt             0.8552     (0.333)  T    1.9000    49.0000    71.0000    10.5761     1.4471         61 973.02
                                       W    0.2961    0.0037     0.0060     0.0000      0.6543         0.0000
U Mich             0.7630     (0.300)  T    2.4000    54.0000    77.0000    9.8572      1.8367         47 372.89
                                       W    0.2829    0.0035     0.0057     0.0000      0.6252         0.0000
Stanford           0.9394     (0.267)  T    1.8000    47.0000    52.0000    10.9319     1.4307         22 887.03
                                       W    0.3095    0.0039     0.0063     0.0000      0.6840         0.0000
Yale               1          (0.267)  T    2.5000    36.0000    55.0000    4.7000      1.6453         35 234.00
                                       W    0.0000    0.0278     0.0000     0.0135      0.5692         0.0000
Cedars             0.9836     (0.233)  T    2.0000    47.0000    70.0000    10.0143     1.4842         59 521.52
                                       W    0.2606    0.0076     0.0020     0.0000      0.5634         0.0000
U Chicago          0.8998     (0.233)  T    2.3000    50.0000    63.0000    9.8120      1.7339         30 351.05
                                       W    0.2454    0.0048     0.0049     0.0000      0.6410         0.0000

Notes: T = Target, W = Weight.
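The R² values quoted for Figure 3 are presumably squared Pearson correlations from a simple linear fit. Purely as an illustration of that calculation, the Python sketch below applies it to the two score columns of Table 4 (the hospital-wide DEA score and the USNWR score in parentheses); this is a different pairing from the one plotted in Figure 3, the function name is hypothetical, and no numerical result from it is claimed here.

```python
from math import sqrt


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy / sqrt(sxx * syy)) ** 2


# Score columns of Table 4, in the row order of the table
dea = [1, 1, 1, 1, 0.8983, 1, 0.9122, 0.8439, 0.8361,
       0.9006, 0.9232, 1, 0.8552, 0.7630, 0.9394, 1, 0.9836, 0.8998]
usnwr = [1, 0.967, 0.833, 0.833, 0.767, 0.700, 0.600, 0.600, 0.567,
         0.533, 0.500, 0.367, 0.333, 0.300, 0.267, 0.267, 0.233, 0.233]

r2 = r_squared(dea, usnwr)
```

The same function can be applied to any pair of score vectors, such as hospital-wide and average department scores.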




Figure 3 Comparison of hospital-wide and average department scores for US News and World

Report versus DEA results (see online version for colours)

[Figure 3: scatter plot of average department score (vertical axis, 0 to 1) against hospital-wide score (horizontal axis, 0 to 1). DEA: R² = 0.1288, p = 0.1436; USNWR: R² = 0.8535, p = 0.00000005.]

Table 5 Cross-correlations of department DEA results

Column abbreviations follow the row order: Can = Cancer, Dig = Digestive Disorders, ENT = Ear-Nose-Throat, End = Endocrinology, Ger = Geriatrics, Gyn = Gynecology, Hrt = Heart, Kid = Kidney Disease, Neu = Neurology & Neurosurgery, Ort = Orthopedics, Res = Respiratory Disorders, Uro = Urology.

Department                Can      Dig      ENT       End      Ger      Gyn      Hrt      Kid      Neu      Ort      Res      Uro
Cancer                    1
                          (1)
Digestive Disorders       -.0006   1
                          (.6746)  (1)
Ear-Nose-Throat           .1702    .6122    1
                          (.6706)  (.2063)  (1)
Endocrinology             .1603    .4651    .5467     1
                          (.5808)  (.8146)  (.1478)   (1)
Geriatrics                .1696    .4605    .5949     .5536    1
                          (.5370)  (.4046)  (.3652)   (.2751)  (1)
Gynecology                .0238    .3980    .4095     .3302    .2712    1
                          (.5261)  (.5153)  (.3390)   (.3552)  (.3446)  (1)
Heart                     .1765    .8165    .5244     .6013    .7360    .5159    1
                          (.3056)  (.7670)  (.1305)   (.4790)  (.0808)  (.4369)  (1)
Kidney Disease            -.0454   .6316    .4235     .6618    .2712    .6537    .7670    1
                          (.2399)  (.6810)  (.0218)   (.6145)  (.3945)  (.5923)  (.6937)  (1)
Neurology & Neurosurgery  .3067    .7383    .5667     .5279    .7086    .3486    .8860    .6145    1
                          (.4886)  (.7321)  (.2420)   (.8259)  (.3349)  (.4283)  (.5482)  (.7138)  (1)
Orthopedics               .1885    .7623    .3357     .0245    .2752    .2552    .8345    .4590    .6260    1
                          (.4832)  (.9198)  (-.0365)  (.8378)  (.2692)  (.2466)  (.7511)  (.6310)  (.6808)  (1)
Respiratory Disorders     .1353    .0874    .3291     .3770    .4613    .5782    .3691    .4175    .5010    .0612    1
                          (.6736)  (.7389)  (.4653)   (.8333)  (.3572)  (.5441)  (.6285)  (.6993)  (.8183)  (.7823)  (1)
Urology                   -.0474   .7171    .4553     .3515    .3999    .1321    .6990    .4585    .4662    .5491    -.0496   1
                          (.6142)  (.6839)  (.5150)   (.4057)  (.5271)  (.5200)  (.6760)  (.5987)  (.6904)  (.4656)  (.6687)  (1)

Note: USNWR results shown in parentheses.

3.2 Department benchmarking

Since multiple departments contribute to the overall performance of a hospital, separate

benchmarking across each specialty also can be useful. For example, Sunnetci and

Benneyan (2008) applied conventional and weight-restricted DEA models to the above

12 specialty departments examined by USNWR. For the sake of illustration, results




are presented here only for the Ear, Nose, and Throat (ENT) departments. Table 6 and

Table 7 summarise the best practice ENT departments found in that study (with DEA

scores equal to 1.0) and a subset of the full results for all ENT departments, respectively,

with the USNWR results shown in parentheses.

Table 6 All ENT departments with unrestricted DEA scores = 1

DEA best-practice ENT departments

Greater Baltimore Medical Center (.217)

Hospital of the University of Pennsylvania (.490)

Johns Hopkins Hospital (1.00)

Massachusetts Eye and Ear Infirmary (.601)

Mayo Clinic (.504)

Memorial Sloan-Kettering Cancer Center (.346)

Ochsner Clinic Foundation (.213)

Ohio State University Hospital (.308)

St. John's Mercy Medical Center (.237)

University of Alabama Hospital at Birmingham (.287)

University of California San Francisco Medical Center (.403)

University of Kentucky Chandler Hospital (.223)

M.D. Anderson Cancer Center (.543)

Note: USNWR scores shown in parentheses.

Table 7 Subset of unrestricted DEA results for ENT departments

                                            Inputs                             Outputs
Hospital       DEA score  (USNWR)  Row  Nursing    Advanced   Patient    Reputation  Non-mortality    Discharges
                                        index      services   services
Weights                                 v1         v2         v3         u1          u2               u3
UCSF           1          (0.403)  T    2.2000     3.0000     5.0000     9.9000      2.1277           189.0000
                                   W    0.4546     0.0000     0.0000     0.0957      0.0000           0.0003
Cleveland      0.6869     (0.493)  T    2.0000     2.8194     6.0000     24.6032     2.5062           395.9797
                                   W    0.5502     0.0000     0.0592     0.0137      0.0000           0.0028
Tampa          0.4114     (0.215)  T    1.5000     1.5000     7.0000     2.3970      13.1487          277.1343
                                   W    0.7423     0.7770     0.0217     0.0000      0.0000           0.0088
Univ WA        0.4999     (0.428)  T    2.10000    2.50000    5.23694    25.20444    2.77828          382.06735
                                   W    0.530663   0.354384   0.000000   0.009028    0.000001         0.004640
Johns Hopkins  1          (1)      T    1.90000    2.50000    7.00000    40.60000    1.88679          275.00000
                                   W    0.22629    0.00000    0.08143    0.02463     0.00000          0.00000
Ochsner        1          (0.213)  T    1.50000    2.50000    6.00000    0.00000     1 000 000.00000  74.00000
                                   W    0.29973    0.22016    0.00000    0.00497     0.00000          0.00276

Notes: T = Target, W = Weight.




These results again illustrate poor agreement with the USNWR findings and the common

problem of zero or irrational weights in unrestricted models (cells with italic font and grey shading in Table 7). The University of Washington Medical Center (UWMC), for

example, places less weight on non-mortality than on reputation, whereas the University

of California Hospital in San Francisco (UCSF) places no weight on non-mortality,

advanced services, and patient services. Table 8 illustrates how results change when the

weight restriction approaches described in Section 2.3 were applied, using the bounds

shown in Column 2, with different hospitals being efficient based on the particular

approach and bound values used.

Table 8 Comparison of best-practice ENT departments for each weight-restricted model

Model: Basic ordering
Bounds used: u2 ≥ u1; u2 ≥ u3
Best-practice hospitals:
Greater Baltimore Medical Center
Hospital of the University of Pennsylvania
Massachusetts Eye and Ear Infirmary
Memorial Sloan-Kettering Cancer Center
Ochsner Clinic Foundation
St. John's Mercy Medical Center, St. Louis
M.D. Anderson Cancer Center
University of California, San Francisco Medical Center

Model: Lower bounds
Bounds used: v1 ≥ 0.43, v2 ≥ 0.33, v3 ≥ 0.18; u1 ≥ 0.14, u2 ≥ 0.58, u3 ≥ 0.22
Best-practice hospitals:
Greater Baltimore Medical Center
Hospital of the University of Pennsylvania
Memorial Sloan-Kettering Cancer Center
Ohio State University Hospital
UCLA Medical Center
University of Alabama Hospital at Birmingham
University of Kentucky Chandler Hospital

Model: POT
Bounds used: u1 = 0.25(u1 + u2 + u3); u2 = 0.50(u1 + u2 + u3); u3 = 0.25(u1 + u2 + u3)
Best-practice hospitals:
Memorial Sloan-Kettering Cancer Center
Ochsner Clinic Foundation
Ohio State University Hospital



Table 9 Hyper-efficiency results for ENT departments (USNWR), based on 1000 replications

Hospital                  Percent efficient (%)   Average efficiency   Standard deviation
Ochsner                   100.00                  1.00                 4.5736E-12
Sloan-Kettering           27.73                   0.69                 0.3538
San Fran Medical Center   2.40                    0.55                 0.1624
M.D. Anderson             1.51                    0.80                 0.1805
Mass Eye and Ear          0.77                    0.84                 0.2415
St. John's Mercy          0.30                    0.53                 0.2425
Pennsylvania              0.10                    0.73                 0.1932
Advocate                  0                       0.21                 0.2045
Barnes-Jewish             0                       0.36                 0.3262
Beth Israel               0                       0.31                 0.1218
Brigham                   0                       0.29                 0.1065
Charleston                0                       0.28                 0.1148
Christiana                0                       0.38                 0.1526
Clarian                   0                       0.44                 0.4030
Cleveland                 0                       0.56                 0.0734
Cullen                    0                       0.34                 0.0545
Duke                      0                       0.21                 0.2038
Emory                     0                       0.60                 0.0730
Baltimore                 0                       0.50                 0.1161
H. Lee Moffitt            0                       0.44                 0.1006
St. Raphael               0                       0.35                 0.1413
Johns Hopkins             0                       0.32                 0.2982
Mass General              0                       0.23                 0.2275
Mayo                      0                       0.40                 0.4553
Miami                     0                       0.11                 0.1169
Mount-Sinai               0                       0.53                 0.1778
Presbyterian              0                       0.56                 0.1842
Ohio State                0                       0.03                 0.1090
Oregon                    0                       0.30                 0.0924
Rush                      0                       0.13                 0.1352
Shands                    0                       0.35                 0.3295
St. Francis               0                       0.08                 0.0860
St. Joseph's              0                       0.13                 0.1296
Stanford                  0                       0.06                 0.1187
Tampa                     0                       0.06                 0.1213
UCLA                      0                       0.04                 0.0991
Alabama                   0                       0.02                 0.0782
California                0                       0.22                 0.0784
San Diego                 0                       0.15                 0.04863
Chicago                   0                       0.03                 0.0878
Iowa                      0                       0.02                 0.0839
Kentucky                  0                       0.03                 0.0930
Jackson Memorial          0                       0.58                 0.2038
Michigan                  0                       0.63                 0.1135
Minnesota                 0                       0.04                 0.1120
North Carolina            0                       0.39                 0.1063
Pittsburgh                0                       0.75                 0.1207
Washington                0                       0.39                 0.0509
Vanderbilt                0                       0.03                 0.1127
Yale                      0                       0.50                 0.1896




Figure 4 Comparison of DEA and USNWR department scores (unrestricted models) (see online version for colours)

[Figure: twelve panels (Cancer, Digestive Disorders, Heart, Kidney Disease, Ear-Nose-Throat, Endocrinology, Neurology and Neurosurgery, Orthopedics, Geriatrics, Gynecology, Respiratory Disorders, Urology), each plotting Score (0 to 1) against Hospital (1 to 49) for two series, USN and DEA.]




Table 9 summarises the efficiency score means and standard deviations for each ENT department from 1000 replications using uniformly distributed POT weights. The Ochsner Clinic Foundation in New Orleans is the only hyper-efficient hospital, meaning it always will be on the frontier for any possible POT bounds. In contrast, Memorial Sloan-Kettering Cancer Center and the University of California, San Francisco Medical Center are on the frontier for only 28% and 2.4% of the (g_i, h_j) space, respectively (based on 1000 replications). No other ENT departments ever were efficient, including those shown in shaded cells that previously were on the unrestricted frontier when at least one input or output was ignored, presumably due to the small number of replications and the small region over which they would be efficient in a larger analysis. As shown in Figure 4, moreover, the DEA scores for all departments usually are larger than and uncorrelated with the USNWR scores.
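The replication idea behind Table 9 can be illustrated with a minimal sketch. It is a simplification rather than the procedure used in the study: instead of solving a bounded DEA linear programme in each replication, it draws one common set of proportion-of-total weights per replication and scores every unit with those fixed weights. The function name `frontier_share` and the three-unit data set are hypothetical.

```python
import random


def frontier_share(inputs, outputs, replications=1000, seed=0):
    """Return (percent of replications best, average relative score) per unit.

    Simplification, not the paper's LP: every replication scores all units
    with one common, randomly drawn set of weights.
    """
    rng = random.Random(seed)
    n_units = len(inputs)
    wins = [0] * n_units
    score_sums = [0.0] * n_units
    for _ in range(replications):
        # Draw uniform weights and rescale them to proportions of the total.
        v_raw = [rng.random() for _ in inputs[0]]
        u_raw = [rng.random() for _ in outputs[0]]
        v = [w / sum(v_raw) for w in v_raw]
        u = [w / sum(u_raw) for w in u_raw]
        # Weighted outputs over weighted inputs for every unit.
        ratios = [
            sum(uj * y for uj, y in zip(u, outputs[k]))
            / sum(vi * x for vi, x in zip(v, inputs[k]))
            for k in range(n_units)
        ]
        best = max(ratios)
        for k, ratio in enumerate(ratios):
            score_sums[k] += ratio / best
            if ratio == best:
                wins[k] += 1
    return [
        (100.0 * wins[k] / replications, score_sums[k] / replications)
        for k in range(n_units)
    ]


# Hypothetical data: unit 0 dominates the others on every measure.
inputs = [[1.0, 1.0], [2.0, 2.0], [2.0, 1.0]]
outputs = [[2.0, 2.0], [2.0, 2.0], [1.0, 2.0]]
results = frontier_share(inputs, outputs)
for unit, (percent_best, average_score) in enumerate(results):
    print(unit, percent_best, round(average_score, 3))
```

A unit that is best in every replication plays the role of a hyper-efficient DMU in this simplified setting; units with a small but non-zero share correspond to DMUs that are efficient over only a small region of the weight space.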

3.3 Benchmarking of national healthcare systems

Similar analyses can be conducted to compare entire national healthcare systems.

For example, the World Health Organization (WHO) ranked the performance of 193

countries by assigning equal weights to several dozen measures of overall health,

responsiveness, resources expended, and distribution of services (Musgrove et al., 2000),

although their study received a fair amount of criticism due to data, analysis

methodology, weighting, and fairness issues (Alan, 2001; Jamison and Sandbu, 2001;

Starfield, 2000).

Table 10 Data elements used in DEA study of national healthcare systems

Dimension                    Data element or surrogate measure
Care and outcomes (output)   Healthy life expectancy
                             Adult non-mortality rate
                             Infant non-mortality
                             Morbidity surrogate measure (non-TB rate)
Equity (output)              Weighted combination of urban-to-rural under five year mortality rate, upper-to-lower wealth quartile, and none-to-high education mother ratios (equity)
Safety (output)              Non adverse event rate
Cost and resources (input)   Per capita total expenditure
                             Doctors and nurses per 1000 capita (trained medical people)
                             Hospital beds per 1000
Prevention (input)           Surrogate measure (immunisation rate)
Demographics (input)         Median age

Notes: All mortality, morbidity, and adverse event outputs converted to non-mortality, non-morbidity, and non-AE rates.




Table 11 Sample of results for unrestricted CRS output-oriented model using all 180 countries

TB

prevalence

0.00003

0.00001

0.00060

0.00073

0

0.00009

0.00003

0.00039

00.00090

0.00015

00.00030

0.00003

0.00001

0.00051

0.00002

Infant

mortality

rate

0.00412

0.0008

0.01895

0.00001

0.03357

00.017

0.00026

0.003

0.00218

0.04858

0

0.00527

0.00624

0.01862

0

0.00509

0.00071

0.0156

0.00188

Adult

mortality

rate

0.0058

0.0021

0.00770 0

0.0096

0.0116

0.0076

0.0055

0.0015

0.0122

0.0085

0.008100.007700.0070

0.0042

0.0091

0.0018

Outputs

Healthy life expectancy at birth

77.78

0.12

71.17

0.559

60.94

0.283

65.067075.063056.240

0.238

78.046

0.368

70.667

0.607

73.178

0.00001

64.788

0.374

Median age

38.9

1.26

32.7

1.46

24.9

4.35232.75

42.9

0.28

19.8

4.39

38.4

1.77

28.1

2.92

36.5

1.32261.18

Immunisation rate

0.07

0.34

0.1767

0.8149

0.397600.2200.01

15.068

0.27200.03

39.535

0.19

0.3903

0.0633

0.5504

0.1733

1.1486

Trained

medical

people

11.9 0 2

.1

0.1134

1.3

0.0551

2.4998

0.0713

9.7704

0.0012

0.97010.3 0

3.0424

0.085

11.3 0 2

.7 0

Inputs

Percap.

spending

$2,6690$61

0.002

$27

0.001

$1640$2,662

0.0002

$13

0.003

$167

0.001

$110$2,1630$146

0.001

Beds360.011

23.11

0.007

5.9820 18 0

129.3702.368050.0020 26 0 3

30.01290.034

Country              Score
Canada               0.818
China                0.624
India                0.667
Jamaica              1
Japan                1
Pakistan             0.902
Russian Federation   0.422
Turkey               0.645
USA                  0.847
Venezuela            0.938

Notes: T = Target (1st row), W = Weight (2nd row).




As an alternate approach, Benneyan et al. (2007) and Ceyhan and Benneyan (2008) applied DEA to a subset of these data across six healthcare system dimensions, summarised in Table 10. In some cases, surrogate measures were used for a general dimension (e.g., immunisation rates as a marker for preventive care), with a total of five inputs and six outputs. All data were gathered from the WHO website (see Note 1) with the exception of the safety adverse event data, compiled from wrongdiagnosis.com. Although a small amount of missing data were imputed via multiple regression, thirteen of the 193 countries still were eliminated because most of their data were missing, with equity and safety measures both available for only 39 countries. Two separate analyses therefore were conducted, first on only these 39 countries with all measures and then again on all 180 countries, in the latter case both combined and partitioned into each of the WHO's four economic categories (based on gross national income per capita). The average healthy life expectancy measure was treated as bounded data (using an arbitrary upper bound of 80) and transformed via the OR approach.

Table 11 summarises a sample (given space limitations) of the unrestricted DEA results for the larger data set (i.e., without safety and equity), where the first and second rows for each country again contain the target values and weights, respectively. One hundred and fifteen of the 180 countries were not on the best-in-class frontier, regardless of whether they have abundant inputs; for example, Jamaica and Japan both are efficient, whereas the USA and Turkey both are inefficient. Table 12 summarises the reference sets for those countries in Table 11, with the percentage weights normalised to sum to 100% (representing the contributions of the reference countries for each particular healthcare system to become efficient). Again note that efficient healthcare systems do not have any others (other than themselves) in their reference sets.
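The normalisation of reference-set weights described above can be sketched as follows. This is an illustration only: the function name `reference_set_percentages` and the peer weights are hypothetical, and the sketch assumes the raw weights are the non-negative envelopment (lambda) values that a DEA solver returns for one DMU.

```python
def reference_set_percentages(lambdas, tol=1e-9):
    """Convert raw reference weights for one DMU into percentage contributions."""
    # Only peers with a strictly positive weight belong to the reference set.
    positive = {name: lam for name, lam in lambdas.items() if lam > tol}
    total = sum(positive.values())
    shares = {name: 100.0 * lam / total for name, lam in positive.items()}
    # List peers in decreasing order of contribution.
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


# Hypothetical raw weights for one inefficient system (not values from the study).
inefficient = reference_set_percentages({"Peer A": 0.2, "Peer B": 0.0, "Peer C": 0.6})

# An efficient system has only itself in its reference set.
efficient = reference_set_percentages({"Jamaica": 1.0})
print(inefficient, efficient)
```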

Table 12 Reference sets for unrestricted CRS output-oriented model, listed in decreasing order of weights

Country              Reference set
Canada               Jordan (30.8%), Sweden (24.8%), Mexico (18.3%), Oman (10.8%), Iceland (7.8%), Guatemala (7.6%)
China                Syrian Arab Rep. (14.7%), Bhutan (11.0%), Eritrea (9.4%), Comoros (5.0%), Vietnam (3.6%)
India                Comoros (86.4%), Cape Verde (9.8%), Uganda (2.9%), Guatemala (0.9%)
Jamaica              Jamaica (100%)
Japan                Japan (100%)
Pakistan             Comoros (96.7%), Zambia (2.5%), Guatemala (0.7%)
Russian Federation   Syrian Arab Rep. (59.9%), Oman (21.8%), Seychelles (20.5%), Singapore (2.7%)
Turkey               Nicaragua (48%), Belize (43.6%), Jamaica (5.0%), Oman (3.5%)
USA                  Jordan (65.5%), Sweden (22.7%), Iceland (6.2%), Guatemala (4.5%), Mexico (1.2%)
Venezuela            El Salvador (40.5%), Comoros (33.3%), Morocco (9.1%), Syrian Arab Rep. (8.3%), Singapore (3.5%), Mexico (4.2%), Jordan (1.1%)




Table 13 lists all countries that were efficient only in the economic group analyses (left-hand columns) and those that were efficient in both sets of analyses (right-hand columns), in the second case indicating a sense of strong or robust efficiency and significant potential value in studying these national systems to gain valuable insights. In contrast, the USA healthcare system interestingly never exhibits efficiency, presumably because it does not transform the much higher levels of resources it consumes into proportionally higher levels of outputs (even under VRS assumptions). Figure 5 illustrates the small correlation between rankings produced by the WHO and DEA studies (for the CRS output-oriented unrestricted overall model). While almost statistically significant (p = 0.5192), the agreement is fairly weak, with R² = 0.048. Thirteen of the WHO's best performing countries are inefficient overall and in their respective economic groups, with the exceptions of only Japan and Switzerland, whereas some countries with the fewest healthcare resources and ranked poorly by the WHO are efficient in the DEA analysis.
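As a hedged illustration of how agreement between two rankings can be quantified, the sketch below computes the squared Pearson correlation between two rank lists, which is one common way to obtain an R² for rankings; the source does not state exactly how its R² and p-value were computed. The function name `rank_r_squared` and the four-country rankings are hypothetical.

```python
def rank_r_squared(ranks_a, ranks_b):
    """Squared Pearson correlation between two equally long lists of ranks."""
    n = len(ranks_a)
    mean_a = sum(ranks_a) / n
    mean_b = sum(ranks_b) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(ranks_a, ranks_b))
    spread_a = sum((a - mean_a) ** 2 for a in ranks_a)
    spread_b = sum((b - mean_b) ** 2 for b in ranks_b)
    return covariance ** 2 / (spread_a * spread_b)


# Hypothetical rankings of four systems by two different studies.
study_one = [1, 2, 3, 4]
study_two = [2, 1, 4, 3]
agreement = rank_r_squared(study_one, study_two)
print(agreement)
```

A value near zero, as reported for the WHO and DEA rankings, indicates that one ranking explains almost none of the variation in the other.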

Table 13 Summary of efficient national healthcare systems, overall and within economic groups

Efficient only within economic group   Efficient both within and between economic groups
Andorra                                Antigua & Barbuda
Bahrain                                Bangladesh
Brunei Darussalam                      Belarus
Canada                                 Belize
Colombia                               Benin
Cuba                                   Bhutan
Democratic Republic of the Congo       Burundi
Djibouti                               Cape Verde
Equatorial Guinea                      Chile
Grenada                                Comoros
Hungary                                Costa Rica
Indonesia                              Cyprus
Iran                                   Czech Republic
Kuwait                                 Dominica
Libyan Arab Jamahiriya                 Ecuador
Maldives                               El Salvador
Mauritius                              Eritrea
Namibia                                Ethiopia
Philippines                            Finland
Qatar                                  Gambia
Republic of Korea                      Guatemala
Saudi Arabia                           Haiti
Slovakia                               Honduras
Uzbekistan                             Iceland
Venezuela                              Israel
Zimbabwe                               Jamaica
                                       Japan
                                       Jordan
                                       Kyrgyzstan
                                       Malaysia
                                       Mexico
                                       Morocco
                                       Mozambique
                                       Nepal
                                       Nicaragua
                                       Niger
                                       Oman
                                       Panama
                                       Paraguay
                                       Rwanda
                                       Seychelles
                                       Sierra Leone
                                       Singapore
                                       Slovenia
                                       Somalia
                                       Spain
                                       Sri Lanka
                                       Swaziland
                                       Sweden
                                       Switzerland
                                       Syrian Arab Republic
                                       Tajikistan
                                       Tonga
                                       Uganda
                                       United Republic of Tanzania
                                       Vietnam
                                       Zambia




Figure 5 Low correlation between DEA and WHO rankings of national healthcare systems (see online version for colours)

[Figure: scatter plot titled "Comparison of WHO vs. DEA rankings", DEA Ranking (0 to 200) against WHO Ranking (0 to 200), annotated r2 = .0196.]

Notes: R2 = 0.048, p = 0.5192.

3.4 Benchmarking at the regional state level

The same type of analysis also can be used to benchmark state and regional healthcare systems. In 2007, the Commonwealth Fund published a comparison of the relative performances of the healthcare systems of all US states based on a (subjectively) weighted scorecard analysis of 32 measures in five dimensions of care: outcomes, quality, access, efficiency, and equity (Cantor et al., 2007). Their general methodology consensus-ranked the states for each measure separately, then rank ordered the systems within each dimension based on the average of their measure ranks within that dimension, and finally rank ordered the overall state healthcare systems based on their average dimension ranks. As an alternative, using a subset of these data (shown in Table 14), Table 15 summarises the results of a DEA comparison of the state healthcare systems (Benneyan et al., 2007). Again, note that the unbounded model assigned zero weights (italic cells) to some performance measures in order for many states to appear efficient.
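The rank-averaging scorecard procedure described above can be sketched roughly as follows. This is an interpretation of the three steps for illustration, not the Commonwealth Fund's actual code: the function names, the tie-handling rule (tied values share the better rank), the assumption that higher measure values are better, and the three-state data are all hypothetical.

```python
def rank_values(values, higher_is_better=True):
    """Return 1-based ranks (1 = best); tied values share the better rank."""
    ordered = sorted(values, reverse=higher_is_better)
    return [ordered.index(value) + 1 for value in values]


def scorecard_ranks(dimensions):
    """dimensions maps a dimension name to a list of measure columns,
    each column holding one value per state (higher assumed better)."""
    dimension_ranks = []
    for measures in dimensions.values():
        # Step 1: rank the states on each measure separately.
        per_measure = [rank_values(column) for column in measures]
        n_states = len(per_measure[0])
        # Step 2: average the measure ranks and rank within the dimension.
        averages = [
            sum(ranks[s] for ranks in per_measure) / len(per_measure)
            for s in range(n_states)
        ]
        dimension_ranks.append(rank_values(averages, higher_is_better=False))
    # Step 3: average the dimension ranks and rank the states overall.
    n_states = len(dimension_ranks[0])
    overall = [
        sum(ranks[s] for ranks in dimension_ranks) / len(dimension_ranks)
        for s in range(n_states)
    ]
    return rank_values(overall, higher_is_better=False)


# Hypothetical data for three states, two dimensions.
example = {
    "access": [[90, 80, 70], [95, 85, 90]],
    "quality": [[50, 60, 40]],
}
overall_ranks = scorecard_ranks(example)
print(overall_ranks)
```

Unlike DEA, every state here is scored with the same implicit weights, and the resources consumed to achieve each measure play no role unless they are entered as measures themselves.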

Conducting the same analysis with the weight restriction constraints listed below

results in a 12.1% average decrease in efficiency scores, with only Hawaii, Maine,

Massachusetts, Minnesota, Utah, and Vermont remaining efficient:

v4 > v1, v4 > v2, v4 > v3 (5)

v5 > v1, v5 > v2, v5 > v3 (6)

u4 > u7, u4 > u8 (7)




u5 > u7, u5 > u8 (8)

u9 > u11 > u6 > u5, u6 > u4 (9)

u10 > u1. (10)
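Constraints (5) to (10) are simple pairwise orderings between weights, so a candidate weight vector can be checked against them directly. The sketch below expands the chain in (9) into pairs and reports any violated ordering; the function name `violated_orderings` and the numeric weights are hypothetical and are not taken from the study. In an LP formulation each pair would typically become a row of the form "larger weight minus smaller weight is at least some small positive constant", which is an assumption about implementation rather than something stated in the source.

```python
# Each pair (a, b) encodes the restriction a > b from constraints (5) to (10).
ORDERINGS = [
    ("v4", "v1"), ("v4", "v2"), ("v4", "v3"),                  # (5)
    ("v5", "v1"), ("v5", "v2"), ("v5", "v3"),                  # (6)
    ("u4", "u7"), ("u4", "u8"),                                # (7)
    ("u5", "u7"), ("u5", "u8"),                                # (8)
    ("u9", "u11"), ("u11", "u6"), ("u6", "u5"), ("u6", "u4"),  # (9)
    ("u10", "u1"),                                             # (10)
]


def violated_orderings(weights, orderings=ORDERINGS):
    """Return the list of (larger, smaller) pairs that the weights violate."""
    return [(hi, lo) for hi, lo in orderings if not weights[hi] > weights[lo]]


# Hypothetical weights that satisfy every restriction.
weights = {
    "v1": 1, "v2": 1, "v3": 1, "v4": 2, "v5": 2,
    "u1": 1, "u2": 1, "u3": 1, "u4": 2, "u5": 2, "u6": 3,
    "u7": 1, "u8": 1, "u9": 5, "u10": 2, "u11": 4,
}
print(violated_orderings(weights))

# Lowering u10 below u1 breaks constraint (10) only.
print(violated_orderings(dict(weights, u10=0.5)))
```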

Figure 6 compares the frequency with which each state is in another's reference set in the restricted and unrestricted cases, with Hawaii and Utah being the most frequent benchmarks in the weight-restricted model, followed by Massachusetts, Minnesota, Michigan, and Vermont. As shown in Figure 7 and in contrast to the WHO results, fairly strong correlation exists between the CWF and weight-restricted DEA ranks (R² = 0.687, p = 0.000000037). In general, those states in the top quartile of the CWF study also were efficient in the DEA analysis, with the exception of New Hampshire, Rhode Island, Connecticut, Nebraska, and North Dakota; conversely, Utah was ranked 24th by the CWF but was still on the DEA efficiency frontier.

Table 14 Data elements used in DEA analysis of state healthcare systems (using Commonwealth Fund data)

Dimension                 Weight   Element
Access (outputs)          u1       Adults under age 65 insured (O1)
                          u2       Children insured (O2)
                          u3       Adults visited a doctor in past two years (O3)
Quality (outputs)         u4       Percent of adults age 50 and older received recommended screening and preventive care (O4)
                          u5       Percent of children ages 19-35 months received all recommended doses of five key vaccines (O5)
                          u6       Percent of hospitalised patients received recommended care for acute myocardial infarction, congestive heart failure, and pneumonia (O6)
                          u7       Adults with a usual source of care (O7)
                          u8       Children with a medical home (O8)
Healthy lives (outputs)   u9       Non-mortality amenable to healthcare, deaths per 100 000 population (O9)
                          u10      Infant non-mortality, deaths per 1000 live births (O10)
                          u11      Percent of adults under age 65 unlimited in any activities because of physical, mental, or emotional problems (O11)
Cost of care (inputs)     v1       Medicare hospital admissions for ambulatory care sensitive conditions per 100 000 beneficiaries (I1)
                          v2       Medicare 30-day hospital readmissions as a percent of admissions (I2)
                          v3       Percent of home health patients with a hospital admission (I3)
                          v4       Total single premium per enrolled employee at private-sector establishments that offer health insurance (I4)
                          v5       Total Medicare (Parts A and B) reimbursements per enrollee (I5)




Table 15 Results of DEA analysis of state healthcare systems without weight restrictions

Ref sets

CA

DC

AZ, HI, MD, MA, UT

HI

MA

HI, MD, RI, VT

CT, HI, IA, ME, MA, NH, RI

CI, DE, DC, HI, MD, MN, RI

CA, IA, UT

WY

O11

10.90

0.06

10.8 0

15.17

014.23

013.70

014.18

013.47

0 14 0.03

11.38

0.064

14.5 0

O10

994

0.0021

989 0

993.2

0992 0

995.2

0994.6

0.0016

994.4

0992 0

995.2

0.0034

993.3

0

O9

99907

099840

099907

0.0004

99913

0.0004

99914

099912

099913

099894

099930

099923

0

O8 37.5

0 45.2

0 47.1

0 45.3

0 60.3

0 57.2

0 56.1

0.22

8

52.0

0 50.9

0 40.5

0.8416

O7

71.1 0

77.7 0

80.1 0

81.8 0

87.1 0

85.8 0

86.0

0.030

83.9

0.0124

82.2 0

74.9 0

O6

79.4 0

83.9 0

82.6 0

79.9 0

85.8 0

84.8 0

86.8

0.033

85.0 0

87.1 0

80.3 0

O5

77.9

0.0357

73.5

0.0251

80

0.0082

80.1

0.0654

93.5

0.0695

91.7 0

86.0

0.0012

81.5 0

82.7

0.0224

78.6 0

O4

37.4 0

45.6 0

42.019

0.1069

36.6 0

46.7 0

44.374

0.4186

44.961

045.379

0.7712

42.915

037.3 0

O3

76.7 0

91.5

0.086

86.546

0.078

88.9

0.03

90.3 0

89.8

0.056

87

0.010

88.5

0.034

82.8 0

73.9 0

O2

87 0

92.8 0

90.84

094.7 0

94.8 0

94.6 0

93.67

091.98

0.001

93.3 0

89.3 0

Outputs

O1

75.5 0

83.3 0

81.86

087.2 0

85.4 0

85.66

086.32

0.062

84.5 0

86.3 0 81 0

I57424

06312

0.00009

5937

04530

0.00022

7804

0.00013

6835.5

06014.6

05975

0.00004

6009

05323

0

I43534

04218

03328

03119

04141

03858

0.00014

3782

0.00019

3773

0.0002

3781

0.00035

3761

0.00032

I3 21.9

0.0456

27.3

0.01481

21.2

0.04402

24.7 0 29 0

27.425

029.3

0.01497

24.86

028.169

025.6 0

I2 18.2 0

20.4 0

17.4

0.0065

14.5

0.000

19.8

0.000

17.9

0.0326

15.777

0.000

16

0.0119

17.24

0.000

13.3 0

Inputs

I16383

08101

05794

04069

07830

06831

06480

06683

06156

06016

0

State   Score
CA      1
DC      1
FL      0.955
HI      1
MA      1
NY      0.90
OH      0.85
SC      0.86
TX      0.75
WY      1

Notes: T = Target, W = Weight.




Figure 6 Frequency of state healthcare system being in a benchmark reference set (see online version for colours)

[Figure: bar chart of Frequency (%) (0 to 32) by state (AZ, CA, CO, CT, DE, DC, HI, IA, ME, MD, MA, MI, MN, NE, NH, NJ, NC, ND, OR, PA, RI, SD, UT, VT, WA, WV, WY) for two series, Restricted weights and Unrestricted weights.]

Figure 7 Comparison of Commonwealth Fund and weight-restricted DEA rankings of state healthcare systems (see online version for colours)

[Figure: scatter plot titled "Comparison of State Results", DEA Ranking (0 to 60) against The Commonwealth Fund Ranking (0 to 60), annotated R2 = 0.687, p = 0.0000000037.]

4 Conclusion

DEA is an effective benchmarking tool that can help identify systems and processes on

the best practice frontier, provide actionable targets to transform non-frontier systems to

best-in-class, and identify comparators that each system should study and emulate. As

such, DEA is a useful complement to other benchmarking methods and often produces




different conclusions or additional insight, underscoring both its value and the value of

more quantitatively considering the amounts of input resources consumed relative to

outputs produced.

DEA adds particular value when there are multiple inputs and outputs to consider and when the relationships and best weighting structure among them are not immediately transparent, with the additional advantages of determining empirically achievable targets, identifying non-frontier DMUs that never can be called best under any weighting scheme, and discovering possibly otherwise unidentified processes of excellence that other methods may miss. Good examples of this are the identification of Jamaica, Pakistan, Hawaii, and Utah as having very efficient healthcare systems, while most comparison and reform discussions tend to focus on a handful of more developed countries or industrialised states. Many of the DEA-best hospitals identified in Section 3.1 also typically are under-examined by Solucient, USNWR, and other popular benchmarking studies.

In contrast to these and other typical analyses that assign subjective or consensus weighting schemes to each of several criteria, DEA determines each system's optimal weights that maximise its score relative to the others. Since no other combination of weights can produce a higher relative score, results can be thought of as optimistic in the sense that DEA computes the best possible case for each DMU; conversely, any system not on the DEA frontier can never be efficient for any other set of weights, however chosen. An additional interpretation of the computed weights is that they somewhat reflect each system's intrinsic tradeoff values, lending insight to management styles and dispositions.

It also is important to understand the meaning of being on the DEA frontier and to not misinterpret results, namely that such DMUs are the most efficient among the particular set of DMUs being considered at transforming inputs into outputs, whereas inefficient countries and hospitals (such as the US healthcare system and Tampa General Hospital's ENT department) still may produce very good outcomes, just at disproportionate costs. Since it is a relative rather than absolute measurement method, inefficient DMUs also might perform very efficiently but just be outshone by others; conversely, the most efficient DMUs may not exhibit much excellence but simply be the best among a bad lot.
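The relative nature of DEA scores is easiest to see in the special case of one input and one output under constant returns to scale, where each unit's efficiency reduces to its output-to-input ratio divided by the best ratio in the comparison set. The sketch below uses that special case only; the function name `crs_efficiency_single` and the data are hypothetical.

```python
def crs_efficiency_single(inputs, outputs):
    """CRS efficiency for one input and one output per unit:
    each unit's productivity relative to the most productive unit."""
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


# Two hypothetical units: the first defines the frontier.
before = crs_efficiency_single([2.0, 4.0], [2.0, 2.0])

# Adding a more productive third unit lowers both earlier scores,
# even though neither of those units changed.
after = crs_efficiency_single([2.0, 4.0, 1.0], [2.0, 2.0, 2.0])
print(before, after)
```

The first unit is efficient in the smaller comparison set but not in the larger one, which is the sense in which a frontier unit may simply be the best among those being compared.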

A sufficient number of DMUs also should be used to obtain useful differentiation between them, with a common rule-of-thumb being that it should exceed twice the total number of input and output categories. Too few DMUs or too many inputs and outputs can allow almost any system to appear efficient by placing most weight on a few variables in which it might excel, greatly limiting the practical value of the analysis (although this is less true as more weight restrictions are imposed). It also is important to structure the data such that all inputs and outputs are independent of one another for theoretical reasons (total operating costs and average physician salary being a possible counter-example). Finally, if the analyst has additional modelling insight about how inputs are transformed into outputs, related methods such as stochastic frontier analysis also can be appropriate.
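The rule-of-thumb above amounts to a one-line check. The sketch below applies it to the 39-country analysis from Section 3.3, which used five inputs and six outputs; the function name `enough_dmus` is hypothetical.

```python
def enough_dmus(n_dmus, n_inputs, n_outputs):
    """Rule-of-thumb: the number of DMUs should exceed twice the
    total number of input and output categories."""
    return n_dmus > 2 * (n_inputs + n_outputs)


# 39 countries with five inputs and six outputs: 39 > 2 * 11 = 22.
national_study_ok = enough_dmus(39, 5, 6)

# A hypothetical study of 20 DMUs with the same measures would fall short.
small_study_ok = enough_dmus(20, 5, 6)
print(national_study_ok, small_study_ok)
```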

With roughly 5700 hospitals in the USA alone, the potential to improve healthcare systems via benchmarking is significant. Basic DEA models usually will be sufficient for this purpose, although in some cases modelling issues such as those discussed above necessitate alternate methods. As demonstrated, the weight restriction and OR models offer the analyst simple solutions in such cases. Software and spreadsheet macros to perform all conventional and modified analyses illustrated in this paper are available from the lead author. Although treating the above examples with VRS or input-oriented models may produce different results, the primary intent here was to demonstrate the types of analyses possible and how they can be useful to improvement activities at department, hospital, or national levels. While not explored here, in a similar manner DEA also can be used to benchmark the performance of individual providers, such as cardiac surgeons (Chilingerian, 1995). Taking a different viewpoint, Feng and Antony (2008) described using the DMAIC process to execute a DEA study, with each activity in the analysis mapping to one of the lettered steps (Define inputs and outputs, Measure their values, Analyse DEA results, Improve by benchmarking reference DMUs, and Control by measuring efficiency over time).

References

Aksezer, C. and Benneyan, J.C. (2003) Handling missing values in data envelopment analysis, QPL-2003-02 Research Paper, Northeastern University.

Alan, W. (2001) Science or marketing at WHO, a commentary on World Health 2000, Health Economics, Vol. 10, pp.93-100.

Amemiya, T. (1985) Advanced Econometrics, Cambridge: Harvard University Press.

Benneyan, J.C., Ceyhan, M.E. and Sunnetci, A. (2007) Data envelopment analysis of national healthcare systems and their relative efficiencies, The 37th International Conference on Computers and Industrial Engineering, pp.251-261.

Benneyan, J.C. and Sunnetci, A. (2008) Handling Proportional and Bounded Values in Data Envelopment Analysis, in review.

Benneyan, J.C., Sunnetci, A. and Aksezer, C. (2006) Identifying healthcare's Toyotas: production frontiers, hospital quality indices, and modeling issues, International Computers in Industrial Engineering Conference, p.3596.

Burstin, H.R., Conn, A., Setnik, G., Rucker, D.W., Cleary, P.D., O'Neil, A.C., Orav, E.J., Sox, C.M. and Brennan, T.A. (1999) Benchmarking and quality improvement: the Harvard emergency department quality study, American Journal of Medicine, Vol. 107, pp.437-449.

Cantor, J.C., Schoen, C., Belloff, D., How, S.K.H. and McCarthy, D. (2007) Aiming Higher: Results from a State Scorecard on Health System Performance, The Commonwealth Fund Commission on a High Performance Health System.

Ceyhan, M.E. and Benneyan, J.C. (2008) Data envelopment estimates for the most efficient national healthcare systems given uncertain proportional rate inputs, IIE Industrial Engineering Research Conference, pp.1760-1765.

Charnes, A., Cooper, W.W. and Rhodes, A. (1981) Evaluating program and managerial efficiency: an application of data envelopment analysis to program follow through, Management Science, Vol. 27, pp.668-697.

Chenoweth, J. (2003) Benchmarking could save hospitals billions: mortality, complications could be reduced, Healthcare Benchmarks and Quality Improvement, p.10.

Chilingerian, J.A. (1995) Evaluating physician efficiency in hospitals: a multivariate analysis, European Journal of Operational Research, Vol. 80, No. 3, pp.548-574.

Collins-Fulea, C., Mohr, J.J. and Tillett, J. (2005) Improving midwifery practice: the American college of nurse-midwives benchmarking project, Journal of Midwifery & Women's Health, Vol. 50, pp.461-471.

Cooper, W., Seiford, L. and Tone, K. (2000) Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References, Kluwer.

Doyle, J. and Green, R. (1994) Efficiency and cross-efficiency in DEA: derivations, meanings and uses, Journal of the Operational Research Society, Vol. 45, No. 5, pp.567-578.

Feng, Q. and Antony, J. (2008) Integrating data envelopment analysis into Six Sigma methodology for measuring physician productivity, IIE Industrial Engineering Research Conference.

Jamison, D.T. and Sandbu, M.E. (2001) Global health: WHO ranking of health system performance, Science, Vol. 293, pp.1595-1596.

McFarlane, E., Murphy, J., Olmsted, M.G., Drozd, E.M. and Hill, C. (2007) America's best hospitals 2007 methodology, U.S. News & World Report.

McNair, C.J. and Leibfried, K.H.J. (1992) Benchmarking: A Tool for Continuous Improvement, John Wiley and Sons, Inc.

Medina-Borja, A., Pasupathy, K.S. and Triantis, K. (2007) Large-scale Data Envelopment Analysis (DEA) implementation: a strategic performance management approach, Journal of the Operational Research Society, Vol. 58, pp.1084-1098.

Musgrove, P., Creese, A., Preker, A., Baeza, C., Anell, A. and Prentice, T. (2000) The World Health Report 2000 - Health Systems: Improving Performance, World Health Organization.

Sexton, T., Silkman, R. and Hogan, A. (1986) Data envelopment analysis: critique and extensions, in R. Silkman (Ed.) Measuring Efficiency: An Assessment of Data Envelopment Analysis, San Francisco: Jossey Bass.

Starfield, B. (2000) Is US health really the best in the world?, Journal of American Medical Association, Vol. 284, pp.483-500.

Stevenson, W.J. (1998) Production and Operations Management, McGraw-Hill.

Sunnetci, A. and Benneyan, J.C. (2008) Weight restricted DEA models to identify the best U.S. hospitals, IIE Industrial Engineering Research Conference, pp.1748-1753.

Note

1 www.who.int/en/
