

Continuous Optimization

Data envelopment analysis with imprecise data

Dimitris K. Despotis a,*, Yiannis G. Smirlis b

a Department of Informatics, University of Piraeus, 80 Karaoli & Dimitriou Str., 18534 Piraeus, Greece
b Department of Statistics and Actuarial Science, University of Piraeus, 80 Karaoli & Dimitriou Str., 18534 Piraeus, Greece

Received 19 September 2000; accepted 4 June 2001

Abstract

In original data envelopment analysis (DEA) models, inputs and outputs are measured by exact values on a ratio scale. Cooper et al. [Management Science, 45 (1999) 597–607] recently addressed the problem of imprecise data in DEA, in its general form. We develop in this paper an alternative approach for dealing with imprecise data in DEA. Our approach is to transform a non-linear DEA model to a linear programming equivalent, on the basis of the original data set, by applying transformations only on the variables. Upper and lower bounds for the efficiency scores of the units are then defined as natural outcomes of our formulations. It is our specific formulation that enables us to proceed further in discriminating among the efficient units by means of a post-DEA model and the endurance indices. We then proceed still further in formulating another post-DEA model for determining input thresholds that turn an inefficient unit to an efficient one. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Data envelopment analysis; Interval data; Ordinal data; Imprecise data

1. Introduction

Data envelopment analysis (DEA) is a non-parametric method for evaluating the relative efficiency of decision-making units (DMUs) on the basis of multiple inputs and outputs. The original DEA models [2] assume that inputs and outputs are measured by exact values on a ratio scale. Recently, Cooper et al. [6] addressed the problem of imprecise data in DEA, in its general form. The term ‘‘imprecise data’’ reflects the situation where some of the input and output data are only known to lie within bounded intervals (interval numbers) while other data are known only up to an order. Imprecise DEA (IDEA), proposed in that work, is the first unified approach for dealing directly with imprecise data in DEA (bounds and/or rankings imposed directly on input/output data). In the same work, IDEA was extended to AR-IDEA to include the assurance region approach (fixed or ratio bounds and/or ordinal relations imposed on the weights)

European Journal of Operational Research 140 (2002) 24–36

www.elsevier.com/locate/dsw

*Corresponding author. Tel.: +30-1-4142315; fax: +30-1-4142357.

E-mail address: [email protected] (D.K. Despotis).

0377-2217/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.

PII: S0377-2217(01)00200-4


[1,9–11,14]. As long as interval and ordinal numbers are concerned in DEA, the CCR DEA model becomes non-linear as, apart from the weights, the levels of inputs and outputs are also variables to be estimated. The approach in IDEA is to transform the non-linear model to a linear programming equivalent, by imposing scale transformations on the data and variable alterations (products of variables are replaced by new variables).

Prior to IDEA, pertinent work was that of Cook et al. [4,5], which, however, is confined only to mixtures of exact and ordinal data. They started by dealing with only one ordinal input [4] and then extended their model [5] to handle multiple cardinal and ordinal criteria. The basic idea in these models is to assign new auxiliary variables, one for each ordinal variable and each of its distinct rank positions. The value of such an auxiliary variable corresponding to a DMU $j$ and a rank position $k$ is 1 if DMU $j$ is rated in the $k$th place of the ordinal variable, and 0 otherwise. The associated weights are then restricted accordingly to represent the weak or strict ordinal relations between the rank positions. In the case of strict ordinal relations, a positive scalar $\varepsilon$ is used as a discrimination parameter between consecutive rank positions (sometimes different parameters are used for different criteria due to different measurement scales). As the results are sensitive to perturbations of $\varepsilon$, various technical arrangements are elaborated to evaluate a common parameter value for all criteria. Extensions of these basic ideas are reported in [3,12].
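The 0/1 auxiliary-variable scheme described above can be sketched as follows (a minimal illustration; the function name `rank_indicator` and the convention that rank 1 denotes the highest place are our own assumptions, not notation from the cited papers):

```python
def rank_indicator(ranks):
    """Auxiliary 0/1 matrix in the spirit of Cook et al.'s models:
    entry [j][k] is 1 iff DMU j is rated in the (k+1)-th rank position
    of the ordinal variable (here rank 1 denotes the highest place)."""
    n = len(ranks)
    return [[1 if ranks[j] == k + 1 else 0 for k in range(n)] for j in range(n)]

# Three DMUs ranked 2nd, 1st and 3rd on some ordinal criterion:
print(rank_indicator([2, 1, 3]))  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

The ordinal weight restrictions of [4,5] are then imposed between the columns of this matrix.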

In this paper we develop an alternative approach for dealing with imprecise data (mixtures of exact, interval and ordinal data in the same setting). We transform the non-linear DEA model to a linear programming equivalent by using a straightforward formulation, completely different from that in IDEA. Contrary to IDEA, our transformations on the variables are made on the basis of the original data set, without applying any scale transformations on the data. The original CCR DEA model with exact data, in its multiplier form, then derives straightforwardly as a special case of our model. The potential of our transformations enables us to uncover and thoroughly examine some new aspects of efficiency in an imprecise data setting, such as the variation of the efficiency scores of the units. On the basis of our particular transformations, new models are naturally introduced to estimate upper and lower bounds of the efficiency scores of the units, as well as to classify and further discriminate the units in terms of the variability of their efficiency scores. Moreover, we address and solve the problem of determining input thresholds that turn an inefficient unit to an efficient one, in an imprecise data setting. In Section 2, we formulate a DEA model for dealing with interval data. Then, on the basis of this model, we define upper and lower bound efficiencies for the units. We also develop a post-DEA benevolent formulation [8,13] by which we examine the degree to which a unit can maintain its efficiency score in a data setting favourable for the other units. For this purpose we introduce endurance indices that we use to discriminate further among the efficient units. In Section 3, we extend our interval DEA model to incorporate ordinal data, thus dealing with the more general case of imprecise data. We take, for this purpose, the line of Cook et al. [5].
In Section 4, we proceed still further in formulating another post-DEA model, by which we determine the changes that should be made in interval inputs in order to turn an inefficient unit to an efficient one. Conclusions are given in Section 5.

2. A DEA model with interval data

Assume $n$ units, each using $m$ inputs to produce $s$ outputs. We denote by $y_{rj}$ the level of the $r$th output ($r = 1,\dots,s$) from unit $j$ ($j = 1,\dots,n$) and by $x_{ij}$ the level of the $i$th input ($i = 1,\dots,m$) to the $j$th unit. Unlike the original DEA model, we assume further that the levels of inputs and outputs are not known exactly; the true input–output data are known to lie within bounded intervals, i.e. $x_{ij} \in [x_{ij}^L, x_{ij}^U]$ and $y_{rj} \in [y_{rj}^L, y_{rj}^U]$, with the upper and lower bounds of the intervals given as constants and assumed strictly positive. In such a setting, the following CCR DEA model:



$$
\begin{aligned}
\max\ & h_{j_0} = \sum_{r=1}^{s} u_r y_{rj_0} \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ij_0} = 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1,\dots,n, \\
& u_r, v_i \ge \varepsilon \quad \forall r, i,
\end{aligned} \tag{1}
$$

is non-linear (non-convex) as, apart from the original variables $u_1,\dots,u_s$ and $v_1,\dots,v_m$ (weights for outputs and inputs, respectively), the levels of inputs $x_{ij}$ and outputs $y_{rj}$ are also variables whose exact values are to be estimated.

2.1. Model formulation

Our approach is to transform the above model (1) into an equivalent linear program. First we apply the following transformations to the variables $x_{ij}$ and $y_{rj}$:

$$x_{ij} = x_{ij}^L + s_{ij}(x_{ij}^U - x_{ij}^L), \quad i = 1,\dots,m,\ j = 1,\dots,n, \text{ with } 0 \le s_{ij} \le 1,$$

$$y_{rj} = y_{rj}^L + t_{rj}(y_{rj}^U - y_{rj}^L), \quad r = 1,\dots,s,\ j = 1,\dots,n, \text{ with } 0 \le t_{rj} \le 1.$$

With these transformations, the variables $x_{ij}$ and $y_{rj}$ in model (1) are replaced by the new variables $s_{ij}$ and $t_{rj}$, which locate the levels of inputs and outputs within the bounded intervals $[x_{ij}^L, x_{ij}^U]$ and $[y_{rj}^L, y_{rj}^U]$, respectively. Model (1) still remains non-linear due to the products of variables $v_i s_{ij}$ for inputs and $u_r t_{rj}$ for outputs. We then replace these products with new variables $q_{ij} = v_i s_{ij}$ and $p_{rj} = u_r t_{rj}$. According to these transformations, the weighted sum of inputs (composite input) for unit $j$ in model (1) takes the form

$$\sum_{i=1}^{m} v_i x_{ij} = \sum_{i=1}^{m} v_i \big[ x_{ij}^L + s_{ij}(x_{ij}^U - x_{ij}^L) \big] = \sum_{i=1}^{m} \big[ v_i x_{ij}^L + v_i s_{ij}(x_{ij}^U - x_{ij}^L) \big] = \sum_{i=1}^{m} \big[ v_i x_{ij}^L + q_{ij}(x_{ij}^U - x_{ij}^L) \big],$$

where the new variables $q_{ij}$ meet the conditions $0 \le q_{ij} \le v_i$, since $s_{ij} = q_{ij}/v_i$ with $v_i \ge \varepsilon$ and $0 \le s_{ij} \le 1$ for every $i$ and $j$. Similarly, the weighted sum of outputs (composite output) for unit $j$ takes the form

$$\sum_{r=1}^{s} u_r y_{rj} = \sum_{r=1}^{s} u_r \big[ y_{rj}^L + t_{rj}(y_{rj}^U - y_{rj}^L) \big] = \sum_{r=1}^{s} \big[ u_r y_{rj}^L + u_r t_{rj}(y_{rj}^U - y_{rj}^L) \big] = \sum_{r=1}^{s} \big[ u_r y_{rj}^L + p_{rj}(y_{rj}^U - y_{rj}^L) \big],$$

with $0 \le p_{rj} \le u_r$ for every $r$ and $j$, as explained above. With the above substitutions, model (1) is finally transformed into the following linear program:

$$
\begin{aligned}
\max\ & h_{j_0} = \sum_{r=1}^{s} \big[ u_r y_{rj_0}^L + p_{rj_0}(y_{rj_0}^U - y_{rj_0}^L) \big] \\
\text{s.t.}\ & \sum_{i=1}^{m} \big[ v_i x_{ij_0}^L + q_{ij_0}(x_{ij_0}^U - x_{ij_0}^L) \big] = 1, \\
& \sum_{r=1}^{s} \big[ u_r y_{rj}^L + p_{rj}(y_{rj}^U - y_{rj}^L) \big] - \sum_{i=1}^{m} \big[ v_i x_{ij}^L + q_{ij}(x_{ij}^U - x_{ij}^L) \big] \le 0, \quad j = 1,\dots,n, \\
& p_{rj} - u_r \le 0, \quad r = 1,\dots,s,\ j = 1,\dots,n, \\
& q_{ij} - v_i \le 0, \quad i = 1,\dots,m,\ j = 1,\dots,n, \\
& u_r, v_i \ge \varepsilon \ \forall r, i; \quad p_{rj} \ge 0,\ q_{ij} \ge 0 \ \forall r, i, j.
\end{aligned} \tag{2}
$$
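As a quick numerical check of the substitution behind model (2) — that $q_{ij} = v_i s_{ij}$ linearises the composite input exactly, with $0 \le q_{ij} \le v_i$ — the identity can be verified on random data (a sketch using NumPy; all names and the random set-up are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4                                       # number of inputs
xL = rng.uniform(1.0, 5.0, m)               # lower bounds of the intervals
xU = xL + rng.uniform(0.5, 3.0, m)          # upper bounds of the intervals
v = rng.uniform(0.1, 1.0, m)                # input weights
s_ = rng.uniform(0.0, 1.0, m)               # positions s_ij inside the intervals
x = xL + s_ * (xU - xL)                     # transformed input levels
q = v * s_                                  # substituted variables q = v * s

lhs = np.dot(v, x)                          # composite input, original form
rhs = np.dot(v, xL) + np.dot(q, xU - xL)    # composite input, linearised form
assert abs(lhs - rhs) < 1e-12               # identical by construction
assert np.all((0 <= q) & (q <= v))          # 0 <= q <= v since 0 <= s <= 1
```

The analogous identity holds for the composite output with $p_{rj} = u_r t_{rj}$.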


The CCR DEA model with exact input–output data derives as a special case of model (2). Indeed, if the lower and upper bounds coincide for all inputs and outputs, the bounded intervals are all of zero length. Therefore, the second terms in the summations vanish together with the variables $p_{rj}$ and $q_{ij}$, the constraints of the form $p_{rj} - u_r \le 0$ and $q_{ij} - v_i \le 0$ are eliminated, and model (2) reduces to model (1). If the upper and lower bounds coincide for only some particular entries of Table 1, it is obvious that model (2) can handle exact and interval data in the same setting.

In model (2), the evaluated unit $j_0$ adjusts not only the weights but also the levels of inputs and outputs, within their ranges, in its favour. The latter is accomplished through the variables $q_{ij}$ and $p_{rj}$. Particularly, $q_{ij} = 0$ ($p_{rj} = 0$) if and only if the level of input $i$ (output $r$) for unit $j$ is set equal to $x_{ij}^L$ ($y_{rj}^L$). Accordingly, $q_{ij} = v_i$ ($p_{rj} = u_r$) if and only if the level of input $i$ (output $r$) for unit $j$ is set equal to $x_{ij}^U$ ($y_{rj}^U$). These statements are easily proved by means of the transformations introduced above.
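Model (2) is an ordinary linear program once the decision vector is laid out. A minimal sketch using SciPy's `linprog` (an assumption on our part: the paper itself used the Excel Solver; the function name `interval_dea`, the array layout and the value of ε below are ours):

```python
import numpy as np
from scipy.optimize import linprog

def interval_dea(xL, xU, yL, yU, j0, eps=1e-6):
    """Best-case efficiency of unit j0 under interval data, via model (2).
    xL, xU: (n, m) arrays of input bounds; yL, yU: (n, s) arrays of output
    bounds. Decision vector: [u (s), v (m), p (s*n), q (m*n)]."""
    n, m = xL.shape
    s = yL.shape[1]
    dX, dY = xU - xL, yU - yL
    N = s + m + s * n + m * n
    pvar = lambda r, j: s + m + r * n + j              # column of p_rj
    qvar = lambda i, j: s + m + s * n + i * n + j      # column of q_ij

    c = np.zeros(N)                                    # maximise composite output of j0
    c[:s] = -yL[j0]                                    # (linprog minimises, so negate)
    for r in range(s):
        c[pvar(r, j0)] = -dY[j0, r]

    A_eq = np.zeros((1, N))                            # composite input of j0 equals 1
    A_eq[0, s:s + m] = xL[j0]
    for i in range(m):
        A_eq[0, qvar(i, j0)] = dX[j0, i]

    A_ub, b_ub = [], []
    for j in range(n):                                 # composite output - input <= 0
        row = np.zeros(N)
        row[:s], row[s:s + m] = yL[j], -xL[j]
        for r in range(s):
            row[pvar(r, j)] = dY[j, r]
        for i in range(m):
            row[qvar(i, j)] = -dX[j, i]
        A_ub.append(row); b_ub.append(0.0)
    for r in range(s):                                 # p_rj - u_r <= 0
        for j in range(n):
            row = np.zeros(N)
            row[pvar(r, j)], row[r] = 1.0, -1.0
            A_ub.append(row); b_ub.append(0.0)
    for i in range(m):                                 # q_ij - v_i <= 0
        for j in range(n):
            row = np.zeros(N)
            row[qvar(i, j)], row[s + i] = 1.0, -1.0
            A_ub.append(row); b_ub.append(0.0)

    bounds = [(eps, None)] * (s + m) + [(0, None)] * (s * n + m * n)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return -res.fun
```

For exact data (zero-length intervals) the program collapses to the ordinary CCR multiplier model, which gives a simple sanity check of the implementation.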

2.2. Upper and lower bounds of efficiency scores

When unit $j_0$ is evaluated by model (2), the levels of inputs and outputs of all the units, together with the associated weights, are adjusted in favour of unit $j_0$. In this manner, the efficiency score attained by unit $j_0$ in model (2) (say $h_{j_0}$) is not worse (less) than any other efficiency score that the unit might attain by adjusting the inputs and the outputs in a different way within the limits of the bounded intervals. However, the transformations introduced in Section 2.1 enable us to define explicitly upper and lower bounds of the possible efficiency scores that unit $j_0$ might attain in an interval data setting. Indeed, the following model provides such an upper bound for unit $j_0$:

$$
\begin{aligned}
\max\ & H_{j_0} = \sum_{r=1}^{s} u_r y_{rj_0}^U \\
\text{s.t.}\ & \sum_{i=1}^{m} v_i x_{ij_0}^L = 1, \\
& \sum_{r=1}^{s} u_r y_{rj_0}^U - \sum_{i=1}^{m} v_i x_{ij_0}^L \le 0, \\
& \sum_{r=1}^{s} u_r y_{rj}^L - \sum_{i=1}^{m} v_i x_{ij}^U \le 0, \quad j = 1,\dots,n,\ j \ne j_0, \\
& u_r, v_i \ge \varepsilon \quad \forall r, i.
\end{aligned} \tag{3}
$$

Model (3) is a DEA model with exact data, where the levels of inputs and outputs are adjusted in favour of the evaluated unit $j_0$ and aggressively against the other units. For the evaluated unit, the inputs are adjusted at the lower bounds and the outputs at the upper bounds of the intervals. Unfavourably for the other units, their inputs are contrarily adjusted at their upper bounds and their outputs at their lower bounds. If we denote by $h_{j_0}^U$ the efficiency score attained by unit $j_0$ in model (3), we show in Appendix A that $h_{j_0} = h_{j_0}^U$. It derives that the DEA model with interval data can be solved by a sequence of linear programs with exact

Table 1
Interval data, efficiency scores and endurance indices

DMU j | x1j      | x2j          | y1j        | y2j      | hL_j  | h_j   | Class | ρ_j  | σ_j
1     | [12, 15] | [0.21, 0.48] | [138, 144] | [21, 22] | 0.224 | 1     | E+    | 0.73 | 0.25
2     | [10, 17] | [0.1, 0.7]   | [143, 159] | [28, 35] | 0.227 | 1     | E+    | 1    | 0.86
3     | [4, 12]  | [0.16, 0.35] | [157, 198] | [21, 29] | 0.823 | 1     | E+    | 1    | 1
4     | [19, 22] | [0.12, 0.19] | [158, 181] | [21, 25] | 0.445 | 0.907 | E-    | 0.44 | 0.19
5     | [14, 15] | [0.06, 0.09] | [157, 161] | [28, 40] | 1     | 1     | E++   | 1    | 1


data of the form (3), for one unit at a time. The benefit, however, of using model (2) instead of model (3) is obvious; the former facilitates repetitive use for successive evaluations of the units, exactly as is done with classical DEA. Implementation of model (3), on the other hand, calls for extra programming effort as, when a new unit takes the place of the evaluated unit $j_0$, the coefficients of the linear program change drastically and one must start afresh for each evaluation. In the next section we unfold some other useful issues based on model (2).

The model below provides a lower bound of the efficiency scores for unit j0:

max Fj0 ¼Xs

r¼1

uryLrj0

s:t:Xmi¼1

vixUij0 ¼ 1;

Xs

r¼1

uryLrj0 �Xmi¼1

vixUij0 6 0;

Xs

r¼1

uryUrj �Xmi¼1

vixLij 6 0; j ¼ 1; . . . ; n; j 6¼ j0;

ur; vi P e 8 r; i:

ð4Þ

Model (4) is also a DEA model with exact data. Contrary to model (3), however, the levels of inputs and outputs are now adjusted unfavourably for the evaluated unit $j_0$ and in favour of the other units. For unit $j_0$, the inputs are adjusted at their upper bounds and the outputs at their lower bounds. For the other units, the inputs are favourably adjusted at their lower bounds and the outputs at their upper bounds. In this manner, unit $j_0$ is placed in the worst possible position vis-à-vis the other units. Therefore, the efficiency (say $h_{j_0}^L$) attained by unit $j_0$ in model (4) serves as a lower bound of its possible efficiency scores. So, models (2) and (4) provide for each unit a bounded interval $[h_j^L, h_j]$, in which its possible efficiency scores lie, from the worst to the best case. It is clear, however, that model (4) suffers all the implementation peculiarities discussed above for model (3).
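Because models (3) and (4) are ordinary CCR programs on bound-adjusted exact data, the interval $[h_{j_0}^L, h_{j_0}]$ can be computed with two standard LPs per unit. A sketch with SciPy's `linprog` (the function names and the value of ε are our own choices, not the paper's implementation):

```python
import numpy as np
from scipy.optimize import linprog

def ccr(x0, y0, x_others, y_others, eps=1e-6):
    """CCR multiplier form with exact data:
    max u.y0  s.t.  v.x0 = 1,  u.y_j - v.x_j <= 0 for every unit,  u, v >= eps."""
    x0, y0 = np.asarray(x0, float), np.asarray(y0, float)
    m, s = len(x0), len(y0)
    c = -np.concatenate([y0, np.zeros(m)])             # decision vector [u, v]
    A_eq = np.concatenate([np.zeros(s), x0])[None, :]  # v.x0 = 1
    rows = [np.concatenate([y0, -x0])]                 # frontier constraint for j0
    for xj, yj in zip(x_others, y_others):
        rows.append(np.concatenate([np.asarray(yj, float), -np.asarray(xj, float)]))
    res = linprog(c, A_ub=np.array(rows), b_ub=np.zeros(len(rows)),
                  A_eq=A_eq, b_eq=[1.0], bounds=[(eps, None)] * (s + m))
    return -res.fun

def efficiency_bounds(xL, xU, yL, yU, j0):
    """(lower, upper) efficiency bounds of unit j0, via models (4) and (3)."""
    others = [j for j in range(len(xL)) if j != j0]
    upper = ccr(xL[j0], yU[j0], [xU[j] for j in others], [yL[j] for j in others])
    lower = ccr(xU[j0], yL[j0], [xL[j] for j in others], [yU[j] for j in others])
    return lower, upper
```

Each call adjusts the data at the appropriate interval endpoints, as the text above prescribes, so no imprecise-data machinery is needed inside the LP itself.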

2.3. Classification and discrimination of the units

In an interval data setting, many units are likely to prove efficient as, apart from the flexibility they have in choosing the weights, they are also free to adjust the levels of inputs and outputs in a favourable manner within the intervals. Thus, further discrimination of the efficient units becomes more essential in an interval data setting. On the basis of the above efficiency score intervals, the units can first be classified into three subsets as follows:

$$E^{++} = \{ j \in J : h_j^L = 1 \}, \qquad E^{+} = \{ j \in J : h_j^L < 1 \text{ and } h_j = 1 \}, \qquad E^{-} = \{ j \in J : h_j < 1 \},$$

where $J$ stands for the index set $\{1,\dots,n\}$ of the units. The set $E^{++}$ consists of the units that are efficient in any case (any combination of input/output levels). The set $E^+$ consists of units that are efficient in a maximal sense, but there are input/output adjustments under which they cannot maintain their efficiency. Finally, the set $E^-$ consists of the definitely inefficient units. Moreover, the range of possible efficiency scores can be used to rank further the units in the set $E^+$.
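The classification reads directly off the interval $[h_j^L, h_j]$; a trivial sketch (the numerical tolerance is our own choice):

```python
def classify(h_lower, h_upper, tol=1e-6):
    """Classify a unit by its efficiency-score interval [h_lower, h_upper]."""
    if h_lower >= 1.0 - tol:
        return "E++"   # efficient under every admissible data adjustment
    if h_upper >= 1.0 - tol:
        return "E+"    # efficient in a maximal sense only
    return "E-"        # definitely inefficient

# Units 1, 4 and 5 of Table 1:
print(classify(0.224, 1.0), classify(0.445, 0.907), classify(1.0, 1.0))
# E+ E- E++
```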

It is then possible to investigate the endurance of the efficiency scores, mainly for the efficient units in $E^+$. Consider for this purpose the post-DEA model below:


$$
\begin{aligned}
\max\ & f = \sum_{r=1}^{s} \sum_{j=1}^{n} p_{rj} - \sum_{i=1}^{m} \sum_{j=1}^{n} q_{ij} \\
\text{s.t.}\ & \sum_{r=1}^{s} \big[ u_r y_{rj_0}^L + p_{rj_0}(y_{rj_0}^U - y_{rj_0}^L) \big] = h_{j_0}, \\
& \sum_{i=1}^{m} \big[ v_i x_{ij_0}^L + q_{ij_0}(x_{ij_0}^U - x_{ij_0}^L) \big] = 1, \\
& \sum_{r=1}^{s} \big[ u_r y_{rj}^L + p_{rj}(y_{rj}^U - y_{rj}^L) \big] - \sum_{i=1}^{m} \big[ v_i x_{ij}^L + q_{ij}(x_{ij}^U - x_{ij}^L) \big] \le 0, \quad j = 1,\dots,n, \\
& p_{rj} - u_r \le 0, \quad r = 1,\dots,s,\ j = 1,\dots,n, \\
& q_{ij} - v_i \le 0, \quad i = 1,\dots,m,\ j = 1,\dots,n, \\
& u_r, v_i \ge \varepsilon \ \forall r, i; \quad p_{rj} \ge 0,\ q_{ij} \ge 0 \ \forall r, i, j.
\end{aligned} \tag{5}
$$

Model (5) is a second-stage benevolent formulation that adjusts, in terms of averages, the levels of inputs and outputs in favour of the competitive units while preserving the maximal efficiency score of the evaluated unit $j_0$. When applied to units in $E^+$, and in comparison to the best-case data setting for $j_0$ of model (3), model (5) shows the extent to which unit $j_0$ can maintain its 100% efficiency score in a data setting favourable for the other units, with decreased inputs and increased outputs. To measure the endurance of the efficiency score for the evaluated unit $j_0$, we introduce the indices $\rho$ and $\sigma$, for inputs and outputs, respectively, as follows:

$$\rho_{j_0} = \frac{1}{m(n-1)} \sum_{\substack{j=1 \\ j \ne j_0}}^{n} \sum_{i=1}^{m} \frac{x_{ij}^U - \bar{x}_{ij}}{x_{ij}^U - x_{ij}^L}, \qquad \sigma_{j_0} = \frac{1}{s(n-1)} \sum_{\substack{j=1 \\ j \ne j_0}}^{n} \sum_{r=1}^{s} \frac{\bar{y}_{rj} - y_{rj}^L}{y_{rj}^U - y_{rj}^L},$$

where $\bar{x}_{ij}$, $i = 1,\dots,m$, $j = 1,\dots,n$ ($j \ne j_0$) and $\bar{y}_{rj}$, $r = 1,\dots,s$, $j = 1,\dots,n$ ($j \ne j_0$) are the input and output levels adjusted for the competitive units by model (5), when solved for unit $j_0$. The adjusted levels $\bar{x}_{ij}$ and $\bar{y}_{rj}$ are easily computed by means of the optimal values of the variables $q_{ij}, v_i$ and $p_{rj}, u_r$ in model (5) and the transformations introduced earlier in this section. Indeed, if $(v_i', u_r', q_{ij}', p_{rj}';\ i = 1,\dots,m;\ r = 1,\dots,s;\ j = 1,\dots,n)$ is the optimal solution of model (5), then $s_{ij}' = q_{ij}'/v_i'$ and $t_{rj}' = p_{rj}'/u_r'$, and the adjusted levels of inputs and outputs take the values $\bar{x}_{ij} = x_{ij}^L + s_{ij}'(x_{ij}^U - x_{ij}^L)$ and $\bar{y}_{rj} = y_{rj}^L + t_{rj}'(y_{rj}^U - y_{rj}^L)$, respectively. The indices $\rho$ and $\sigma$ are bounded between 0 and 1. The higher the values of $\rho$ and $\sigma$, the more endurable is the efficiency score of the evaluated unit, in terms of inputs and outputs, respectively, in a benevolent data setting for the competitive units. Particularly, if $\rho_{j_0} = 1$, then unit $j_0$ maintains its maximal efficiency score even if all the other units adjust their inputs at the lower bounds. Respectively, $\sigma_{j_0} = 1$ means that unit $j_0$ preserves its efficiency score when all the other units adjust their outputs at the upper bounds. Thus we can use the endurance indices, separately for inputs and outputs, or an aggregate value of them (the average or a weighted average, for example), to rank the efficient units in $E^+$. Note that the formulae given above for computing the endurance indices are written for the case where the entries in the $n \times m$ and $n \times s$ tables of input and output data, respectively, are all interval numbers (i.e. $x_{ij}^U - x_{ij}^L \ne 0$ and $y_{rj}^U - y_{rj}^L \ne 0$ for all $i, r, j$). In the case of mixtures of interval and exact data, the summations apply only to the interval data and the average is taken accordingly.
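Given the adjusted levels $\bar{x}, \bar{y}$ returned by model (5), the endurance indices are plain averages over the competitors; a sketch assuming all entries are genuine intervals (the function name and array layout are ours):

```python
import numpy as np

def endurance_indices(xL, xU, yL, yU, x_bar, y_bar, j0):
    """rho (inputs) and sigma (outputs) for the evaluated unit j0, from the
    competitors' levels adjusted by the post-DEA model (5). Inputs are (n, m)
    arrays, outputs (n, s) arrays; every entry must be a genuine interval."""
    n = xL.shape[0]
    mask = np.arange(n) != j0                          # competitors only
    rho = np.mean((xU[mask] - x_bar[mask]) / (xU[mask] - xL[mask]))
    sigma = np.mean((y_bar[mask] - yL[mask]) / (yU[mask] - yL[mask]))
    return rho, sigma
```

On the adjusted levels reported in Table 2 for unit 1, this computation gives $\rho_1 \approx 0.73$ and $\sigma_1 = 0.25$, matching the last row of that table.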

2.4. Numerical example

To illustrate the above, consider the interval data setting of Table 1 (5 units with 2 inputs and 2 outputs)and the efficiency scores obtained by applying models (2) and (4).


Unit 5 is classified in $E^{++}$ as it is efficient in any case. Notice that this happens without its being a data-dominating unit. It dominates the other units only with respect to the second input (i.e. $x_{2,5}^U < x_{2j}^L$, $j = 1, 2, 3, 4$). According to the differences $h_j - h_j^L$, the efficient units in $E^+$ are ranked in the following order: unit 3, unit 2, unit 1. For the data of our example, the same ranking of the efficient units is derived if the endurance indices are used instead. In Table 2 we give the input and output levels adjusted by model (5), when solved for unit 1. In parentheses are the corresponding values as considered, favourably for unit 1, in model (3).

3. An extension of the interval DEA model for dealing with imprecise data

Model (2), presented in the previous section, deals with interval data in DEA. In this section we extend the interval DEA model to deal with the more general case of imprecise data, that is, mixtures of interval and ordinal data together with exact data. We do this by taking the line of Cook et al. [5]. Ordinal data are known only to satisfy certain ordinal relations and might be given on numerical or verbal scales. To formulate the general model with imprecise data, let us introduce the following notation, by which we distinguish the inputs and the outputs into cardinal (exact and/or interval) and ordinal ones:

I = {1, ..., m}: the set of indices for inputs
R = {1, ..., s}: the set of indices for outputs
CI: the subset of indices for cardinal inputs (CI ⊆ I)
OI: the subset of indices for ordinal inputs (OI ⊆ I, CI ∪ OI = I)
CR: the subset of indices for cardinal outputs (CR ⊆ R)
OR: the subset of indices for ordinal outputs (OR ⊆ R, CR ∪ OR = R)

Model (6) below deals with both cardinal (exact and/or interval) and ordinal data.

$$
\begin{aligned}
\max\ & h_{j_0} = \sum_{r \in CR} \big[ u_r y_{rj_0}^L + p_{rj_0}(y_{rj_0}^U - y_{rj_0}^L) \big] + \sum_{r \in OR} p_{rj_0} \\
\text{s.t.}\ & \sum_{i \in CI} \big[ v_i x_{ij_0}^L + q_{ij_0}(x_{ij_0}^U - x_{ij_0}^L) \big] + \sum_{i \in OI} q_{ij_0} = 1, \\
& \sum_{r \in CR} \big[ u_r y_{rj}^L + p_{rj}(y_{rj}^U - y_{rj}^L) \big] + \sum_{r \in OR} p_{rj} - \sum_{i \in CI} \big[ v_i x_{ij}^L + q_{ij}(x_{ij}^U - x_{ij}^L) \big] - \sum_{i \in OI} q_{ij} \le 0, \quad j = 1,\dots,n, \\
& p_{rj} - u_r \le 0, \quad r \in CR,\ j = 1,\dots,n, \\
& q_{ij} - v_i \le 0, \quad i \in CI,\ j = 1,\dots,n, \\
& u_r, v_i \ge \varepsilon, \quad r \in CR,\ i \in CI, \\
& p_{rj} \ge 0,\ q_{ij} \ge 0 \quad \forall r, i, j,
\end{aligned} \tag{6}
$$

Table 2
Benevolent and aggressive data adjustment for the evaluated unit 1

DMU j | x̄1j       | x̄2j        | ȳ1j      | ȳ2j
1     | 12 (12)    | 0.21 (0.21) | 144 (144) | 22 (22)
2     | 14.15 (17) | 0.1 (0.7)   | 143 (143) | 28 (28)
3     | 12 (12)    | 0.16 (0.35) | 157 (157) | 21 (21)
4     | 19 (22)    | 0.12 (0.19) | 181 (158) | 25 (21)
5     | 14.54 (15) | 0.06 (0.09) | 157 (157) | 28 (28)

ρ1 = 0.73, σ1 = 0.25


ordinal relations among $\{p_{rj},\ j = 1,\dots,n\}$, $r \in OR$;

ordinal relations among $\{q_{ij},\ j = 1,\dots,n\}$, $i \in OI$.

Cardinal inputs and outputs are formulated exactly as in model (2); this is reflected in the first summation terms in the composite inputs and outputs. Ordinal inputs and outputs are represented in the composite inputs and outputs, respectively, by ordinal variables with a unit coefficient (cf. the second summation terms in the composite inputs and outputs). Moreover, $n - 1$ constraints are introduced, for each ordinal input and output, that preserve the ordinal relations defined on the units by the ordinal data. To clarify this issue, consider for example an ordinal output $y_r$ and two units, say $k$ and $l$, with consecutive ranks according to $y_r$. If unit $k$ possesses a higher rank than unit $l$, then the constraint that preserves this ordinal relation takes the form $p_{rk} - p_{rl} \ge d$, where $d$ is a small positive number. In the case of ties (i.e. units $k$ and $l$ possess the same rank), the constraint is written as $p_{rk} - p_{rl} = 0$.
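The $n - 1$ rank-preserving constraints can be generated mechanically from a rank vector; a sketch (the pair-list representation and the function name are our own):

```python
def ordinal_constraints(ranks):
    """Return the index pairs for the n-1 rank-preserving constraints of
    model (6): (k, l) in `strict` stands for p_rk - p_rl >= d, and (k, l)
    in `ties` for p_rk - p_rl = 0. A higher rank number means a higher place."""
    order = sorted(range(len(ranks)), key=lambda j: -ranks[j])   # best unit first
    strict, ties = [], []
    for k, l in zip(order, order[1:]):
        (ties if ranks[k] == ranks[l] else strict).append((k, l))
    return strict, ties

# Ordinal output of Table 3 (5 = highest rank, 1 = lowest):
strict, ties = ordinal_constraints([4, 2, 5, 1, 3])
print(strict, ties)  # [(2, 0), (0, 4), (4, 1), (1, 3)] []
```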

Model (2) derives as a special case of model (6) when only cardinal data are taken into consideration (i.e. when $OR = \emptyset$, $OI = \emptyset$, $CR = R$, $CI = I$). Indeed, in the absence of ordinal inputs and outputs in model (6), the second summations in the composite inputs and outputs vanish and the ordinal relations are excluded as non-applicable. Upper and lower bound efficiencies, as formulated in models (3) and (4), together with the classification of units in $E^{++}$, $E^+$ and $E^-$, are also applicable in the imprecise data setting, provided that the models are expanded, as in model (6), to incorporate the ordinal data. The same holds for the benevolent formulation of model (5) and the endurance indices.

3.1. Numerical example

As an illustration and for comparison purposes we applied model (6) to the example given in [6] andpresented in Table 3. Five units are considered with a mixture of imprecise and exact data (two inputs – oneexact and one interval – and two outputs – one exact and one ordinal).

The entries of the ordinal output must be read as follows: 5 = highest rank, ..., 1 = lowest rank. Model (6) was implemented in an MS-Excel worksheet and was solved by using the Excel Solver. Two sets of results were obtained: one for $\varepsilon = 10^{-10}$, $d = 10^{-6}$ and one for $\varepsilon = 10^{-8}$, $d = 10^{-6}$ (those in parentheses). The corresponding results for IDEA are as given in [6]. Notice that the results obtained by the IDEA approach and our model (6) are identical to five-digit accuracy. In Appendix B we give the details of the linear program solved for unit 1, together with the adjusted input/output data and the values of the ordinal variables.

Table 3
Exact and imprecise data [6]

DMU j | x1j (exact) | x2j (interval) | y1j (exact) | y2j (ordinal) | IDEA              | Model (6)
1     | 100         | [0.6, 0.7]     | 2000        | 4             | 1 (1)             | 1 (1)
2     | 150         | [0.8, 0.9]     | 1000        | 2             | 0.87500 (0.87499) | 0.87500 (0.87499)
3     | 150         | [1, 1]         | 1200        | 5             | 1 (1)             | 1 (1)
4     | 200         | [0.7, 0.8]     | 900         | 1             | 1 (0.99999)       | 1 (0.99999)
5     | 200         | [1, 1]         | 600         | 3             | 0.70000 (0.69999) | 0.70000 (0.69998)


4. Other extensions

The models presented in the previous sections assume that the upper and lower bounds of the interval data are given as constants. In such an interval data setting, the inefficient units (those in $E^-$) might become efficient if changes are made to the upper or lower bounds of their interval data. We investigate this issue in this section, for one unit and one input at a time (the case of an output instead of an input derives straightforwardly). Assume that unit $j_0$, when evaluated by model (2), is inefficient. As far as inputs are concerned, unit $j_0$ could be turned to an efficient one by further reducing a particular input (say input $k$) below its lower bound, to a level-threshold of efficiency $\bar{x}_{kj_0}$. As $\bar{x}_{kj_0} = x_{kj_0}^L + s_{kj_0}(x_{kj_0}^U - x_{kj_0}^L)$, we can adjust $\bar{x}_{kj_0}$ below $x_{kj_0}^L$ if we let $s_{kj_0} = q_{kj_0}/v_k$ take negative values. Since $v_k > 0$, it is sufficient to let $q_{kj_0}$ be free to take negative values. We are then concerned with estimating the maximum $\bar{x}_{kj_0}$ that turns the inefficient unit $j_0$ to an efficient one. This value can be obtained by estimating the $q_{kj_0}$ and $v_k$ that maximize $s_{kj_0} = q_{kj_0}/v_k$. The model below accomplishes this:

$$
\begin{aligned}
\max\ & z \\
\text{s.t.}\ & (u, v, Q, P) \in S, \\
& q_{kj_0} - z v_k \ge 0,
\end{aligned} \tag{7}
$$

where $u = (u_r,\ r = 1,\dots,s)$, $v = (v_i,\ i = 1,\dots,m)$, $Q = (q_{ij},\ i = 1,\dots,m;\ j = 1,\dots,n)$ and $P = (p_{rj},\ r = 1,\dots,s;\ j = 1,\dots,n)$ are the decision variables in vector form, the variable $z$ represents the maximal value of the ratio $q_{kj_0}/v_k$, and $S$ is the solution space formed by the following set of constraints:

$$
\begin{aligned}
& \sum_{i=1}^{m} \big[ v_i x_{ij_0}^L + q_{ij_0}(x_{ij_0}^U - x_{ij_0}^L) \big] = 1, \\
& \sum_{r=1}^{s} \big[ u_r y_{rj_0}^L + p_{rj_0}(y_{rj_0}^U - y_{rj_0}^L) \big] - \sum_{i=1}^{m} \big[ v_i x_{ij_0}^L + q_{ij_0}(x_{ij_0}^U - x_{ij_0}^L) \big] = 0, \\
& \sum_{r=1}^{s} \big[ u_r y_{rj}^L + p_{rj}(y_{rj}^U - y_{rj}^L) \big] - \sum_{i=1}^{m} \big[ v_i x_{ij}^L + q_{ij}(x_{ij}^U - x_{ij}^L) \big] \le 0, \quad j = 1,\dots,n\ (j \ne j_0), \\
& p_{rj} - u_r \le 0, \quad r = 1,\dots,s,\ j = 1,\dots,n, \\
& q_{ij} - v_i \le 0, \quad i = 1,\dots,m,\ j = 1,\dots,n, \\
& u_r, v_i \ge \varepsilon \ \forall r, i; \quad p_{rj} \ge 0 \ \forall r, j; \quad q_{ij} \ge 0 \ \forall (i, j) \ne (k, j_0); \quad q_{kj_0} \text{ free}.
\end{aligned} \tag{8}
$$

Model (7) is non-linear due to the last constraint. However, it is possible to solve it by resorting to standard LP software, with a two-stage procedure, as follows:

Stage 1. We solve the linear program

$$\max\ q_{kj_0} \quad \text{s.t.}\ (u, v, Q, P) \in S. \tag{9}$$

If $q_{kj_0}^o, v_k^o$ are the values of the variables $q_{kj_0}, v_k$ in the optimal solution of (9), then the ratio $q_{kj_0}^o/v_k^o$ is a value of $z$ for which the unit $j_0$ becomes efficient, as it satisfies, among others, the first two constraints of (8). On the other hand, $z < 0$ as $q_{kj_0} < 0$. So the optimal (maximum) value of $z$ will lie in the bounded interval $[q_{kj_0}^o/v_k^o,\ 0]$.


Stage 2. On the basis of model (7), we perform a bisection search [7] in the interval $[q_{kj_0}^o/v_k^o,\ 0]$ as follows. Let $\underline{z}$ be a value of $z$ for which the constraints of model (7) are consistent (initially $\underline{z} = q_{kj_0}^o/v_k^o$) and $\bar{z}$ a value of $z$ for which the constraints of (7) are not consistent (initially $\bar{z} = 0$). Then the consistency of the constraints is investigated for $z' = (\underline{z} + \bar{z})/2$. If they are consistent, $z'$ replaces $\underline{z}$; if they are not, it replaces $\bar{z}$. The bisection continues until $\underline{z}$ and $\bar{z}$ come sufficiently close to each other, at a desirable degree of accuracy. We end this iterative stage with $z = \underline{z} \cong \bar{z}$ and $(u, v, Q, P)$ an optimal solution of model (7) (i.e. $z = q_{kj_0}/v_k$). The threshold of efficiency $\bar{x}_{kj_0}$ then derives from

$$\bar{x}_{kj_0} = x_{kj_0}^L + \frac{q_{kj_0}}{v_k}\,(x_{kj_0}^U - x_{kj_0}^L).$$
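The two-stage procedure can be sketched generically: Stage 2 only needs a feasibility oracle for the constraints of model (7) at a given $z$ (below a stand-in lambda; in practice it would solve the LP over $S$ with the extra constraint $q_{kj_0} - z v_k \ge 0$). All names here are our own:

```python
def bisect_threshold(z_feasible, z_infeasible, is_consistent, tol=1e-6):
    """Stage 2: bisection search on z in [z_feasible, z_infeasible], where
    is_consistent(z) reports whether model (7)'s constraints admit a solution
    with q_k,j0 - z*v_k >= 0. Returns the largest feasible z found."""
    lo, hi = z_feasible, z_infeasible          # initially q0/v0 and 0
    while abs(hi - lo) > tol:
        mid = (lo + hi) / 2.0
        if is_consistent(mid):
            lo = mid                           # mid feasible: raise the lower end
        else:
            hi = mid                           # mid infeasible: lower the upper end
    return lo

def input_threshold(x_low, x_up, z):
    """Efficiency threshold for input k of unit j0: xL + z*(xU - xL)."""
    return x_low + z * (x_up - x_low)

# Unit 4, first input (Table 1): q = -0.034154 and v = 0.054083 give
# z ~ -0.6315 and a threshold of about 17.105, as reported in the text.
print(round(input_threshold(19, 22, -0.034154 / 0.054083), 3))  # 17.105
```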

As an illustration, consider the data set of Table 1 in Section 2.4. When model (7) is applied to the case of the inefficient unit #4 with respect to the first input, we get in its optimal solution $q_{1,4} = -0.034154$ and $v_1 = 0.054083$, from which it derives that the level of the first input should be adjusted to $\bar{x}_{1,4} = 17.105$. This value is the threshold of efficiency with respect to the first input. Similarly, the threshold of efficiency of unit #4 with respect to the second input is $\bar{x}_{2,4} = 0.1038$ ($q_{2,4} = -2.236360$, $v_2 = 9.637718$).

Model (7), extended to incorporate ordinal inputs as described in the previous section, is also applicable in the case of imprecise data. In such a case, evidently, a cardinal input should be selected for consideration. Applying, for example, model (7) with the necessary modifications to the data of Table 3 shows that the inefficient unit #2 turns into an efficient one if the second input is adjusted to x̄2,2 = 0.7 (the corresponding optimal values are q2,2 = −1.429 and v2 = 1.429).

A possible application of the models presented throughout the paper is the problem of locating ‘‘best buys’’ in a market of competitive products or services (i.e. identifying the products that offer great ‘‘value for money’’). The term ‘‘value’’ represents a composite measure of what the user gets from a product or service, while ‘‘money’’ stands for what the user pays for it. In the absence of individual preferences, this problem can be formulated as an imprecise DEA model with a single input (price) and multiple outputs (indices and performance estimates). In accordance with the DEA terminology, efficient units (products) are those that are worth their price. As the price of a particular product can vary from dealer to dealer or under different discount policies, it is usually known only to lie within a bounded interval, between the lowest and the highest price recorded in the market. Product outputs, on the other hand, may be measured by exact values, by bounded intervals (overall evaluation judgments made by different experts, for example) or by ordinal variables (product ratings obtained from experts or surveys). In such a decision situation, the models discussed in this paper can determine the products that constitute best buys (efficient units) and those that do not (inefficient units). The interval form of the price may uncover products that constitute best buys even at their highest price (i.e. products in the E++ class). The endurance indices, on the other hand, estimated for the products in the E+ class, measure the degree to which these products maintain their position as best buys in a low-price/high-performance competitive environment. Finally, model (7) can determine the price thresholds that would lift the products currently outside the class of best buys into it.
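As a concrete sketch of such a single-input formulation, the following Python code solves an upper-bound CCR multiplier LP for one product: the evaluated product takes its lowest price and highest outputs, while its competitors take their highest price and lowest outputs. This is our own illustration of the upper-bound idea, not the paper's exact model; the function name and data layout are ours, and `scipy.optimize.linprog` is used as the LP solver.

```python
import numpy as np
from scipy.optimize import linprog

def best_buy_upper_score(prices_lo, prices_hi, outs_lo, outs_hi, j0, eps=1e-6):
    """Upper-bound CCR efficiency of product j0 ('best-buy' check).
    Multiplier form:  max u.y_{j0}  s.t.  v * x_{j0} = 1,
    u.y_j - v * x_j <= 0 for all j,  u, v >= eps.
    Product j0 is evaluated under the most favourable data: its lowest
    price and highest outputs; the other products take their highest
    price and lowest outputs.  (A sketch, not the paper's exact model.)"""
    n = len(prices_lo)            # number of products
    s = outs_lo.shape[1]          # number of outputs
    # Variables: [v, u_1, ..., u_s]; linprog minimizes, so negate.
    c = np.concatenate(([0.0], -outs_hi[j0]))
    A_ub, b_ub = [], []
    for j in range(n):
        x = prices_lo[j0] if j == j0 else prices_hi[j]
        y = outs_hi[j0] if j == j0 else outs_lo[j]
        A_ub.append(np.concatenate(([-x], y)))   # u.y_j - v*x_j <= 0
        b_ub.append(0.0)
    A_eq = [np.concatenate(([prices_lo[j0]], np.zeros(s)))]  # v*x_{j0} = 1
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                  A_eq=np.array(A_eq), b_eq=[1.0],
                  bounds=[(eps, None)] * (1 + s))
    return -res.fun
```

A product scoring 1 under this favourable setting is a candidate best buy; the per-unit normalisation constraint is the only row that changes from product to product, mirroring the observation made for Table 5 below.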

5. Conclusion

We developed in this paper an alternative approach for dealing with imprecise data in DEA. As in IDEA, we transform a non-linear DEA model to a linear programming equivalent. We do it, however, in a completely different way. Contrary to IDEA, our transformations concern only the variables and are made on the basis of the original data set, without first applying scale transformations to the data. Due to our transformations, the classical CCR DEA model with exact data, in its multiplier form, derives straightforwardly as a special case of our imprecise DEA model. It is our specific formulation, however, that enables us to proceed further, beyond our basic imprecise DEA models. We show that a DEA model with


interval data can be treated as a peculiar DEA model with exact data, where the coefficients of the main body of the constraints in the associated LPs change when a new unit comes up for evaluation. We define upper and lower bounds for the possible efficiency scores that a unit might attain in an imprecise data setting. We then use these bounds to classify the units. Proceeding still further, we develop a post-DEA benevolent formulation by which we evaluate the degree to which a unit can maintain its efficiency score in a data setting favourable to the other units. We exploit the results of the model to calculate endurance indices for the units, in order to discriminate further among the efficient units. Again on the basis of the specific transformations and formulations introduced in this paper, we address the problem of determining, in an interval or imprecise data setting, input thresholds that turn an inefficient unit into an efficient one.

Besides the advantage we gain from our formulations in addressing the issues summarised above, the models we formulate are more efficient in implementation than IDEA. Indeed, in contrast to our model, an IDEA model must in general be reconstructed when changes are made to the data (for example, when new units come into consideration or some units are removed). This is due to the scale transformations employed on the data, necessary in IDEA in order to identify a unity element, maximal in each input or output data column, which is then used as the basis for variable alterations. Our model is free of such limitations and can be easily implemented, even in a spreadsheet environment.

Appendix A

If $h_{j_0}$ and $h^U_{j_0}$ are the efficiencies of unit $j_0$ obtained, respectively, by models (2) and (3), then it is always $h_{j_0} = h^U_{j_0}$.

First note that if $u = (u_r,\ r = 1, \ldots, s)$, $v = (v_i,\ i = 1, \ldots, m)$ is a feasible solution of model (3), then the augmented solution $(u, v, Q, P)$, with $Q = (q_{ij},\ i = 1, \ldots, m,\ j = 1, \ldots, n)$, $P = (p_{rj},\ r = 1, \ldots, s,\ j = 1, \ldots, n)$ and

$$q_{ij} = \begin{cases} 0, & j = j_0, \\ v_i, & j \neq j_0, \end{cases} \qquad p_{rj} = \begin{cases} 0, & j \neq j_0, \\ u_r, & j = j_0, \end{cases}$$

is a feasible solution of model (2) and $H_{j_0}(u, v) = h_{j_0}(u, v, Q, P)$. Indeed, model (3) derives straightforwardly from model (2) by applying $Q$, $P$ on it.

On the basis of the input–output data considered in model (3), unit $j_0$ is in the best possible position vis-à-vis the other units and thus $h^U_{j_0}$ is the highest possible efficiency score that this unit can attain. So $h_{j_0} \leq h^U_{j_0}$. Let now $u^o = (u^o_r,\ r = 1, \ldots, s)$ and $v^o = (v^o_i,\ i = 1, \ldots, m)$ be the optimal solution of model (3) (i.e. $H_{j_0}(u^o, v^o) = h^U_{j_0}$). Then the augmented solution $(u^o, v^o, Q, P)$, with $Q$ and $P$ as defined above, is a feasible solution of model (2) and, therefore, $h_{j_0}(u^o, v^o, Q, P) = H_{j_0}(u^o, v^o) = h^U_{j_0} \leq h_{j_0}$. Thus $h_{j_0} = h^U_{j_0}$.

Table 4
Adjusted data for unit 1

DMU    Inputs              Outputs
j      x1j     x2j         y1j      y2j
1      100     0.6         2000     4 (3E−06)
2      150     0.8         1000     2 (1E−06)
3      150     1           1200     5 (4E−06)
4      200     0.7         900      1 (0E+00)
5      200     1           600      3 (2E−06)


Table 5

The linear program deriving from model (6) for unit 1

# v1 v2 u1 q11 q12 q13 q14 q15 q21 q22 q23 q24 q25 p11 p12 p13 p14 p15 p21 p22 p23 p24 p25 RHS

1 )100 )0.6 2000 )0.1 1 <¼ 0

2 )150 )0.8 1000 )0.1 1 <¼ 0

3 )150 )1 1200 1 <¼ 0

4 )200 )0.7 900 )0.1 1 <¼ 0

5 )200 )1 600 1 <¼ 0

6 100 0.6 0.1 ¼ 1

7 )1 1 <¼ 0

8 )1 1 <¼ 0

9 )1 1 <¼ 0

10 )1 1 <¼ 0

11 )1 1 <¼ 0

12 )1 1 <¼ 0

13 )1 1 <¼ 0

14 )1 1 <¼ 0

15 )1 1 <¼ 0

16 )1 1 <¼ 0

17 )1 1 <¼ 0

18 )1 1 <¼ 0

19 )1 1 <¼ 0

20 )1 1 <¼ 0

21 )1 1 <¼ 0

22 )1 1 >¼ d23 1 )1 >¼ d24 )1 1 >¼ d25 1 )1 >¼ d

2000 1 max obj

v1, v2, u1 ≥ ε; qij ≥ 0, i = 1, 2, j = 1, …, 5; prj ≥ 0, r = 1, 2, j = 1, …, 5.


Appendix B

Table 4 presents the input/output data adjusted by model (6) when it was solved for the evaluated unit 1, together with the values of the ordinal variables (in parentheses in the last column).

In Table 5 we give the details of the linear program solved for the evaluated unit 1. Notice that when another unit comes up for evaluation, only the objective function and constraint #6 change accordingly.

References

[1] A.I. Ali, W.D. Cook, L.M. Seiford, Strict vs. weak ordinal relations for multipliers in data envelopment analysis, Management Science 37 (1991) 733–738.
[2] A. Charnes, W.W. Cooper, A.Y. Lewin, L.M. Seiford, Data Envelopment Analysis: Theory, Methodology and Applications, Kluwer Academic Publishers, Norwell, MA, 1994.
[3] W.D. Cook, J. Doyle, R. Green, M. Kress, Multiple criteria modeling and ordinal data: Evaluation in terms of subsets of criteria, European Journal of Operational Research 98 (1997) 602–609.
[4] W.D. Cook, M. Kress, L. Seiford, On the use of ordinal data in data envelopment analysis, Journal of the Operational Research Society 44 (1993) 133–140.
[5] W.D. Cook, M. Kress, L. Seiford, Data envelopment analysis in the presence of both quantitative and qualitative factors, Journal of the Operational Research Society 47 (1996) 945–953.
[6] W.W. Cooper, K.S. Park, G. Yu, IDEA and AR-IDEA: Models for dealing with imprecise data in DEA, Management Science 45 (1999) 597–607.
[7] D.K. Despotis, Fractional minmax goal programming: A unified approach to priority estimation and preference analysis in MCDM, Journal of the Operational Research Society 47 (1996) 989–999.
[8] J. Doyle, R. Green, Efficiency and cross-efficiency in DEA: Derivations, meanings and uses, Journal of the Operational Research Society 43 (1994) 567–578.
[9] R.G. Dyson, E. Thanassoulis, Reducing weight flexibility in data envelopment analysis, Journal of the Operational Research Society 39 (1988) 563–576.
[10] B. Golany, A note on including ordinal relations among multipliers in data envelopment analysis, Management Science 34 (1988) 1029–1033.
[11] Y. Roll, W.D. Cook, B. Golany, Controlling factor weights in data envelopment analysis, IIE Transactions 23 (1991) 2–9.
[12] J. Sarkis, S. Talluri, A decision model for evaluation of flexible manufacturing systems in the presence of both cardinal and ordinal factors, International Journal of Production Research 37 (1999) 2927–2938.
[13] T.R. Sexton, R.H. Silkman, A.J. Hogan, Data envelopment analysis: Critique and extensions, in: R.H. Silkman (Ed.), Measuring Efficiency: An Assessment of Data Envelopment Analysis, Jossey-Bass, San Francisco, CA, 1986, pp. 73–105.
[14] R.G. Thompson, L.N. Langemeier, C.T. Lee, R.M. Thrall, The role of multiplier bounds in efficiency analysis with application to Kansas farming, Journal of Econometrics 46 (1990) 93–108.
