Preprocessing DEA.
J.H. Dula1 and F.J. Lopez2
February 2007
Statement of Scope and Purpose. This is a comprehensive study of preprocessing in DEA. The purpose is
to provide tools that will reduce the computational burden of DEA studies especially in large scale applications.
Abstract. We collect, organize, analyze, implement, test, and compare a comprehensive list of ideas for pre-
processors for entity classification in DEA. We limit our focus to procedures that do not involve solving LPs. The
procedures are adaptations from previous work in DEA and in computational geometry. The result is five preprocessing methods, three of which are new for DEA. Testing shows that preprocessors have the potential to classify a large number of DMUs economically, making them an important computational tool, especially in large scale applications.
Key Words: DEA, DEA computations, linear programming, and computational geometry.
1. Introduction. The paper by Charnes et al. [1] which introduced DEA also offered the first
linear program (LP) formulation for classifying and scoring DMUs. The approach proposed there,
and the standard practice to this day, is to formulate and solve one LP for each entity with the LP’s
size determined by the dimensions of the matrix generated by the full data set. This approach for
efficiency classification and scoring in DEA is computationally intensive. Although much is known
about accelerating the process, solving LPs places heavy computational demands on it, especially
in large scale applications.
A preprocessor in DEA is a procedure that can quickly, efficiently, and conclusively classify
and/or score a DMU without solving an LP. This excludes methods that reduce computational
requirements but still involve solving LPs, either by accelerating their performance or by
extracting opportunistic information from them for classification. Preprocessors are not expected to
1 Corresponding author, School of Business, Virginia Commonwealth University, Richmond, VA 23284, [email protected].
2 College of Business Administration, University of Texas at El Paso, El Paso, TX 79968, [email protected].
conclusively classify or score all the DMUs in a DEA study. They are intended to either reduce the
total number of LPs that will eventually have to be solved and/or to reduce their size so that they
can be solved faster. Preprocessors have a long tradition in DEA computations which includes the
works by Sueyoshi and Chang [2], Sueyoshi [3], and Ali [4].
In the current work, we collect and analyze the methods that have been proposed for prepro-
cessing in DEA and introduce three new ones. Two of the new ones are adaptations of work in the
related field of computational geometry and are used to identify efficient DMUs. The third prepro-
cessor, HyperTwist, is in a category of preprocessors based on hyperplane rotation for uncovering
efficient DMUs. This category of preprocessors appears here first. We classify, formalize, analyze,
implement, and compare the different preprocessors.
2. The Role of preprocessors in DEA. The elements of a DEA study are: i) a model; i.e.,
a list of inputs and outputs that characterize the process, ii) a data set of the DMUs’ values for
the attributes in the model, and iii) a returns to scale assumption about the transformation pro-
cess. These elements define a production possibility set: the set of all viable inputs and outputs
obtainable from all combinations of the data along with all possibilities from the free disposability
consequences of the returns to scale assumption. The production possibility set is a convex poly-
hedral set with a portion of its boundary constituting the efficient frontier. A DMU is efficient if
and only if it belongs to the efficient frontier of its production possibility set.
One of the main objectives of any DEA study is the classification of entities as efficient or
inefficient. This classification depends only on the three fundamental components above. A DEA
study may also require the calculation of a “score” for each DMU and associated benchmarking
information when these are inefficient. An entity’s score is the objective function value of an LP
and provides information about its relative position with respect to the efficient frontier. Different
LP formulations provide different scores and benchmarks. Although scores are used in practice
to classify DMUs, the solutions may not provide sufficient conditions for classification as in the
case of the relaxed input and output oriented LP formulations. Scoring is not required in all DEA
studies; evidence of this is the use of the familiar additive LP formulation of Charnes et al. [5].
These LPs provide necessary and sufficient conditions for classification but their scores are mostly
useless since they maximize the 1-norm to the efficient frontier (Briec, [6]).
All the preprocessors discussed here are used to classify DMUs. With the exception of studies
requiring super-efficiency scores, advanced classification of an efficient DMU with an inexpensive
preprocessor saves having to solve an LP altogether for that entity since it is automatically classified
and scored. Any inefficient DMU that is classified by a preprocessor also obviates an LP solution
if scores are not required. The value of an efficient, effective, and economical preprocessor in these
situations is evident.
Low cost classification of inefficient DMUs can also save work in the event that scores and
benchmarks are required. If a DMU is inefficient, its data point can be omitted from the data
matrix of any LP used for any remaining classifications and scoring. The technique based on this
result is called Reduced Basis Entry (RBE) (Ali [4]). Experimental work by Barr and Durchholz
[7] and Dula [8] has shown that RBE can reduce computations in DEA by 50%; the smaller LPs
that remain after employing preprocessors can therefore be solved in much less than half the time.
This suggests that knowledge of inefficient DMUs prior to having to solve any LPs can result in
substantial computational savings especially when relatively many are identified cheaply in large
problems where the proportion of efficient to inefficient DMUs (“density”) is low.
3. Notation and assumptions. A point set consists of n points a^j, j = 1, . . . , n, each with m
dimensions. The set A collects the data points; i.e., A = {a^1, . . . , a^n}. The i-th coordinate of a^j is
denoted by a^j_i. DEA points are composed of two parts, as follows:

    a^j = [−X^j ; Y^j] ∈ ℝ^m;  j = 1, . . . , n;

where 0 ≠ X^j ≥ 0 and 0 ≠ Y^j ≥ 0 are the input and output data vectors, respectively, for DMU
j. We assume that this set is reduced in the sense that no point is duplicated. We denote by
H(π, β) = {y | 〈π, y〉 = β} the hyperplane with normal vector π and level value β, where 〈·, ·〉 is the
inner product of two vectors.
Our development focuses on the VRS production possibility set of DEA. The results are true
for general polyhedral sets and therefore for the other DEA production possibility sets, if not
immediately, with minor adjustments.
4. Background and preprocessing principles. Classifying a DMU as efficient or inefficient is
essentially equivalent to identifying boundary and interior points in a finitely generated polyhedral
set; that is, a polyhedron defined by linear combinations of the elements of a point set (Dula and
Lopez [9]). In DEA, the point set is composed of the data for n DMUs each characterized by m
attribute values. The polyhedral set they generate, depending on the returns to scale assumption,
is the production possibility set. General polyhedral sets can have many shapes. They range from
an unbounded polyhedron, as in DEA, to a fully bounded polytope, such as the convex hull of the
point set.
The problem of identifying boundary and/or interior points in finitely generated polyhedral
sets appears in other areas. It is familiar in computational geometry, point depth analysis in
nonparametric multivariate statistics, redundancy in systems of linear inequalities, and stochastic
programming (see Dula and Lopez [9] for more details and references). Ideas for preprocessing
convex hulls to identify extreme points appear in Dula et al. [10]. Many results from these areas
are available to DEA, but most relevant here is the work on preprocessing for convex hulls.
Previous research on preprocessors for finitely generated polyhedral sets comes mainly from two
sources with different backgrounds: DEA and computational geometry. Dula et al. [10] propose
preprocessors to identify extreme points – an important class of boundary points – specifically
for convex hulls. The procedures in [10] incorporate a variety of ideas ranging from simple sortings
to calculating Euclidean distances to techniques based on inner products. Sueyoshi and Chang
[2], Sueyoshi [3], and Ali [4] propose preprocessors specifically for DEA. Sueyoshi and Chang [2]
introduce the concept of domination to identify inefficient DMUs. Ali [4] also looks at domination
and proposes simple and effective preprocessors for identifying efficient DMUs based on sorting
and inner products.
Existing preprocessors can be classified as: i) approaches that identify boundary points of
polyhedral sets, of which extreme points are especially important (e.g., efficient and extreme-
efficient DMUs in DEA), and ii) methodologies that identify interior points (e.g., inefficient DMUs
in DEA). In the first category there are three basic ideas:
Sorting. The points with the maximum value in each dimension in the VRS DEA model are
extreme-efficient if unique. If not unique, they correspond to DMUs on the boundary, which
means they may or may not be efficient (e.g., weak efficient). These points are identified by
sorting the dimensions. The computational effort for this is minimal and has the potential
of identifying m efficient DMUs in DEA. This idea has been used in Dula et al. [10] in the
context of general convex hulls, and in Ali [4] in the context of DEA.
An adaptation to the DEA constant returns to scale model appears in Shaheen [11]. A
DMU that generates a unique minimum ratio of its components with a projection’s norm is
necessarily an extreme ray of the production possibility set. At most m new CRS extreme-
efficient DMUs can be identified this way.
Norm maximization. Dula et al. [10] propose identifying extreme points of the convex hull
by finding the element in A that maximizes the Euclidean distance to an arbitrary point, p,
in ℝ^m. This is a special case of a more general result given next:

Result 1. Let a^{j∗} = argmax{‖p − a^j‖_ℓ ; j = 1, . . . , n}, where ‖ · ‖_ℓ is the ℓ-norm of the
argument. If a^{j∗} is unique then it is an extreme point of the convex hull of A.
Proof. See Appendix 1.
The procedure in [10] is based on an application of Result 1 for the case of the 2-norm. In this
norm, ties reveal additional extreme points of the convex hull. This will be true of any norm
where support sets for a ball contain exactly a single point, which is not the case for the
1-norm and ∞-norm. Result 1 can be adapted to a DEA VRS production possibility set to
identify extreme-efficient DMUs when the focal point p is judiciously chosen. For example,
for the case of the 2-norm, one such point is the “worst virtual DMU:” p_i = min{a^j_i ; j =
1, . . . , n}, ∀i. Notice that the ∞-norm identifies one of the points revealed through sorting;
namely the one with the overall largest component and, as we will see below, an idea proposed
by Ali in [4] is an application of Result 1 to the 1-norm. Computations for a procedure based
on norm maximization involve calculating and sorting inner products. Result 1 is a new
contribution to computational geometry and DEA.
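As a concrete illustration of Result 1, the following sketch (with a small hypothetical single-input, single-output data set in the (−X, Y) convention; the numbers are ours, not from the paper) computes the maximizer of ‖p − a^j‖_ℓ from the worst virtual DMU for several ℓ-norms. Each unique maximizer is an extreme point, and different norms can reveal different ones.

```python
import numpy as np

# Hypothetical toy data: four DMUs with one input and one output,
# stored as a^j = (-X^j, Y^j); the DMU at index 2 is inefficient.
A = np.array([[-1.0, 1.0],
              [-2.0, 3.0],
              [-2.0, 2.0],
              [-3.0, 3.5]])

p = A.min(axis=0)  # "worst virtual DMU" focal point

for ell in (1, 2, np.inf):
    dists = np.linalg.norm(A - p, ord=ell, axis=1)
    # Result 1: a unique maximizer is an extreme point of conv(A)
    print(f"{ell}-norm maximizer: DMU index {int(np.argmax(dists))}")
```

On this toy set the 1-norm and the 2-norm pick out different extreme-efficient DMUs, consistent with the remark that Ali's aggregation-ratio idea (1-norm) and the 2-norm procedure are different instances of the same result.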
Hyperplane translation. The values of the inner products of the data points in a point set,
A, with some arbitrary vector π ≠ 0 attain a maximum value β∗ at a point, say a^{j∗}. The
vector π and the value β∗ define a hyperplane H(π, β∗), which supports the convex hull of
A at a^{j∗}. If a^{j∗} is the unique support element, this suffices to conclude that it is an extreme
point. When there is more than one support element, all are boundary points. By restricting
π > 0, the maximum inner product, β∗, of the data points with this positive normal defines
a hyperplane, H(π, β∗), that supports the VRS production possibility set. If the support set
is a singleton then the corresponding DMU is extreme-efficient. If the support set contains
more points, all DMUs participating in the tie will be efficient. Since π > 0, the support set
cannot contain weak efficient DMUs. This idea can be visualized as translating a hyperplane
in the direction of its normal through the polyhedral hull until it reaches the last point.
The computational effort involves inner products and sorting. The idea for convex hulls was
implemented and tested in Dula et al. [10]. Ali’s [4] “output/input aggregation ratio” (Lemma
3, p. 64) is a special case where a specific hyperplane with normal vector π = (1, . . . , 1) is
used. (This is also an application of Result 1 using the 1-norm). The new preprocessor based
on hyperplane translation proposed below specifically for DEA allows the use of multiple
hyperplanes restricted only by the condition π > 0 on their normals.
The second category of preprocessors identifies inefficient DMUs. We are aware of only one
approach in this category.
Domination. DMU k totally dominates DMU j, k ≠ j, if a^k ≥ a^j. The dominated DMU is clearly
inefficient. A full implementation of this preprocessor involves at most a quadratic number
of vector subtractions and comparisons. This has the potential of identifying a large number
of inefficient DMUs. The procedure applies to all return to scale assumptions. This idea was
originally proposed by Sueyoshi and Chang [2].
In the next section we implement the preprocessing ideas described above for the case of the
VRS DEA model. In addition, we introduce two new preprocessors for identifying efficient DMUs
in DEA. The first, HyperTran, is an adaptation of the hyperplane translation procedure for convex
hulls from Dula et al. [10]. The second, HyperTwist, falls into an entirely new category based on
hyperplane rotation. It results from adapting a procedure introduced in Lopez and Dula [12] to
assess the impact of adding a new attribute to a DEA model.
5. Implementations. The preprocessing principles discussed in Section 4 can be the basis
for a variety of methods for preprocessing in DEA. Methods can be deterministic or probabilistic,
parameter-dependent or parameter-free, etc. In the spirit of consistency we propose methods based
on the following guidelines: i) they are deterministic; ii) they do not depend on defining parameters
that require experimental tuning; and iii) the number of computations is bounded. The methods
produced under these guidelines are not affected by convergence issues or other stopping criteria.
This provides consistency across the different preprocessing principles and facilitates comparisons.
We design five methods specifically for VRS DEA. The first two, DimensionSort and Dominator,
have been used in DEA before. The other three preprocessors, MaxEuclid, HyperTran, and
HyperTwist, are new to DEA.
The DMUs in the data set are classified as follows:
E is the set of efficient DMUs.
E∗ is the set of extreme-efficient DMUs. Note E∗ ⊆ E .
I is the set of inefficient DMUs.
Each method is formally presented below. The pseudo-codes are followed by a discussion.
Sorting Method: DimensionSort

Procedure: DimensionSort
[INPUT:] A.
[OUTPUT:] Ê∗ ⊆ E∗ ⊆ A.
Initialization. Ê∗ ← ∅.
For i = 1 to m, Do:
    j∗ = argmax_j {a^j_i ; j = 1, . . . , n}.
    If j∗ is unique, Then,
        Ê∗ ← Ê∗ ∪ {a^{j∗}}.
    Else, Resolve Ties
        (Resolve ties by applying this procedure recursively to the points
        involved in the ties on the remaining dimensions).
    End if.
Next i.
Finalization. Ê∗ contains extreme-efficient DMUs.
Notes on Procedure DimensionSort.
1. Procedure DimensionSort identifies extreme-efficient DMUs. Each DMU that emerges from
a pass will have at least one component which is the largest in its dimension. This is sufficient
to conclude that the corresponding point is extreme in the production possibility set. This
is guaranteed by the tie resolution procedure and by the no duplication assumption. The
number of different extreme-efficient DMUs that this method identifies is at most m.
2. Each of the m iterations requires sorting n values. Note that tie resolution can be invoked as
many as m − 1 times within each iteration with possibly as many as n points participating
in each tie.
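A minimal executable sketch of the sorting idea, in Python with NumPy, is given below. It omits the recursive tie-resolution step, and the function name and toy data are ours.

```python
import numpy as np

def dimension_sort(A):
    """Sketch of Procedure DimensionSort: a DMU that uniquely attains the
    maximum value in some dimension is extreme-efficient.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns a set of indices."""
    found = set()
    for i in range(A.shape[1]):
        col = A[:, i]
        top = np.flatnonzero(col == col.max())
        if len(top) == 1:              # unique maximizer in dimension i
            found.add(int(top[0]))
        # ties would be resolved by recursing on the remaining dimensions
    return found

# Hypothetical toy data: a^j = (-X^j, Y^j), one input and one output
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

On this toy set the procedure identifies the DMU with the smallest input and the DMU with the largest output.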
Domination Method: Dominator

Procedure: Dominator
[INPUT:] A.
[OUTPUT:] Î ⊆ I ⊆ A.
Initialization. Î ← ∅.
For j = 1 to n, Do:
    For k = 1 to n, k ≠ j, Do:
        If a^k is still unclassified,
            If a^j ≥ a^k,
                Î ← Î ∪ {a^k}.
            End if.
        End if.
    Next k.
Next j.
Finalization. Î contains inefficient DMUs.
Notes on Procedure Dominator.
1. Procedure Dominator identifies exclusively inefficient DMUs, including weak efficient ones.
Dominator has the potential to identify a large subset of the inefficient DMUs.
2. The implementation is enhanced by omitting from subsequent comparisons points that are
identified as inefficient. This means we can expect to handle fewer and fewer points as the
procedure progresses.
3. Decomposition Schemes. Points in the interior of polyhedral sets defined by subsets of the
data are also interior to the polyhedron generated by the complete point set. An effective
preprocessor can be based on partitioning blocks to identify inefficient DMUs and repeating
this with new blocks composed of entities with unknown status until a final single block
is processed. This decomposition approach can be designed using Dominator to identify
inefficient DMUs within blocks. Since an implementation requires a decision about the size of
the initial and intermediary blocks, it will involve experimental tuning disqualifying it from
further consideration in this article.
4. The procedure requires at most (n − 1)² comparisons. The enhancement may reduce this
substantially.
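The domination test and the enhancement of Note 2 can be sketched as follows (a Python transcription; names and toy data are ours):

```python
import numpy as np

def dominator(A):
    """Sketch of Procedure Dominator: a point that is componentwise dominated
    by another, distinct point is inefficient (the point set is assumed to
    contain no duplicates, per the paper's reduction assumption).

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns inefficient indices."""
    n = A.shape[0]
    inefficient = set()
    for j in range(n):
        for k in range(n):
            # enhancement: skip points already classified as inefficient
            if k == j or k in inefficient:
                continue
            if np.all(A[j] >= A[k]):   # a^j >= a^k componentwise
                inefficient.add(k)
    return inefficient

# Hypothetical toy data: the DMU at index 2 is dominated by the one at index 1
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```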
Euclidean Distance Method: MaxEuclid

Procedure: MaxEuclid
[INPUT:] A.
[OUTPUT:] Ê∗ ⊆ E∗ ⊆ A.
Initialization. Ê∗ ← ∅.
For i = 1 to m, Do:
    p_i = min_j a^j_i.
Next i.
j∗ = argmax_j {〈a^j − p, a^j − p〉 ; j = 1, . . . , n}.
Ê∗ ← Ê∗ ∪ {a^{j∗}}.
Finalization. Ê∗ contains extreme-efficient DMUs.
Notes on Procedure MaxEuclid.
1. Procedure MaxEuclid applies a special case of Result 1 above. Although other norms may
reveal different extreme-efficient DMUs, we anticipate that the same maximizer would emerge
in many of these.
2. Only extreme points maximize the Euclidean distance from a properly selected focal point.
Although the convex hull of the data when the focal point, p, is included may generate
extreme points that are not extreme-efficient DMUs, Result 2 in Appendix 1 demonstrates
how any such points cannot maximize the 2-norm. For this reason, Procedure MaxEuclid
only uncovers extreme-efficient DMUs and, in case of ties, all points are extreme-efficient
although not necessarily on a common face. As many as all the extreme-efficient DMUs
could be identified by MaxEuclid if they are located on the boundary of an m-dimensional
hypersphere when the focal point p is at the center. More realistically, MaxEuclid will identify
one extreme-efficient DMU; more than that is unlikely.
3. Procedure MaxEuclid requires the calculation of a focal point. The point used by the proce-
dure, p, can be interpreted as a “worst virtual DMU”. Other focal points are possible. For
example, focal points can be located such that the point set belongs to the positive orthant
that the focal point determines. Some sort of parameter needs to be defined to generate a
sequence of useful focal points that are sufficiently separated from each other so as to result
in the identification of different extreme points. This parametric dependence violates our
guidelines.
4. This implementation of the 2-norm for identifying extreme-efficient DMUs requires calculating
and sorting n inner products. Note that the ordering of values obtained by the 2-norm is the
same as that of their squares.
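A sketch of the procedure in Python (our function name and toy data; per Note 4, the squared distances are sorted since their order matches that of the 2-norm):

```python
import numpy as np

def maxeuclid(A):
    """Sketch of Procedure MaxEuclid: the maximizer of the squared Euclidean
    distance from the "worst virtual DMU" is extreme-efficient (assuming the
    maximum is unique; tied maximizers would all be extreme-efficient).

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns one index."""
    p = A.min(axis=0)                    # worst virtual DMU, p_i = min_j a^j_i
    d2 = ((A - p) ** 2).sum(axis=1)      # squared 2-norm; order-preserving
    return int(np.argmax(d2))

# Hypothetical toy data: a^j = (-X^j, Y^j)
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```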
Translating Hyperplanes Method: HyperTran

Procedure: HyperTran
[INPUT:] A.
[OUTPUT:] Ê ⊆ E ⊆ A.
Initialization. Ê ← ∅.
For i = 1 to m, Do:
    p_i = min_j (a^j_i − ε).
Next i.
For j = 1 to n, Do:
    π^j = a^j − p.
    j∗ = argmax_k {〈π^j, a^k〉 ; k = 1, . . . , n}.
    Ê ← Ê ∪ {a^{j∗}}.
Next j.
Finalization. Ê contains efficient DMUs.
Notes on Procedure HyperTran.
1. Procedure HyperTran identifies efficient DMUs by translating hyperplanes until they become
supports for the production possibility set. Efficiency is assured by the fact that the normals,
π^j, of the supporting hyperplanes are strictly positive. The procedure is not confounded
by weak efficiency. Every hyperplane translation will identify at least one extreme-efficient
DMU; in case of ties, all are efficient.
2. The point p used in the procedure is, again, the “worst virtual DMU” except that it undergoes a
slight perturbation to assure that π^j > 0. As with MaxEuclid, other focal points are possible.
HyperTran has the potential to identify many of the efficient DMUs since every data point
generates a translating hyperplane.
3. The procedure requires n² inner products and n sortings, one for each π^j defined.
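A sketch of the translation idea in Python (the perturbation, tolerance, and toy data are ours; all tied maximizers are recorded as efficient, per Note 1):

```python
import numpy as np

def hypertran(A, eps=1e-6, tol=1e-9):
    """Sketch of Procedure HyperTran: each data point defines a strictly
    positive normal pi^j = a^j - p; the maximizers of <pi^j, .> over A are
    efficient DMUs.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns efficient indices."""
    p = A.min(axis=0) - eps              # perturbed "worst virtual DMU"
    efficient = set()
    for aj in A:
        pi = aj - p                      # strictly positive normal
        scores = A @ pi
        cutoff = scores.max() - tol * max(1.0, abs(scores.max()))
        efficient.update(int(j) for j in np.flatnonzero(scores >= cutoff))
    return efficient

# Hypothetical toy data: DMUs 0, 1, 3 are efficient; DMU 2 is not
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

On this toy set every efficient DMU is recovered because each one wins the translation generated by its own data point.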
Rotating Hyperplanes Method: HyperTwist.
We use the following terms: u = (1, . . . , 1) ∈ ℝ^m and the ℓ-th unit vector in ℝ^m is e_ℓ, ℓ =
1, . . . , m.
Procedure: HyperTwist
[INPUT:] A.
[OUTPUT:] Ê ⊆ E ⊆ A.
Global Initialization. Ê ← ∅.
For ℓ = 1 to m, Do:
    Step 0. (Local Initialization)
        k = 0; π^ℓ = u − e_ℓ;
        j∗_0 = argmax_j {〈π^ℓ, a^j〉 ; j = 1, . . . , n}.
        Select j∗_0 such that a^{j∗_0}_ℓ is max.
        Ê ← Ê ∪ {a^{j∗_0}}.
    Step 1. (Pivot)
        i. k = k + 1.
        ii. For j = 1, . . . , n, Do:
            γ_j = (〈π^ℓ, a^{j∗_{k−1}}〉 − 〈π^ℓ, a^j〉) / (a^j_ℓ − a^{j∗_{k−1}}_ℓ), if (a^j_ℓ − a^{j∗_{k−1}}_ℓ) > 0;
            γ_j = M (a large number), otherwise.
            Next j.
        iii. γ∗ = min_j γ_j.
        iv. If γ∗ = M, STOP. Otherwise go to Step 2.
    Step 2. (Find RHS)
        β = 〈π^ℓ, a^{j∗_{k−1}}〉 + γ∗ a^{j∗_{k−1}}_ℓ.
    Step 3.
        i. Define J = {j | 〈π^ℓ, a^j〉 + γ∗ a^j_ℓ = β}.
        ii. Select j∗_k ∈ J such that a^{j∗_k}_ℓ is max.
            Ê ← Ê ∪ {a^{j∗_k}}. Go to Step 1.
Next ℓ.
Notes on Procedure HyperTwist.
1. The derivation of the results that make this procedure work has been relegated to an appendix.
2. The procedure generates a sequence of supporting hyperplanes changing their orientation
as they visit extreme points of the production possibility set. Each change in orientation
corresponds to a pivot operation. One pass of Procedure HyperTwist generates a sequence of
supporting hyperplanes that partially wrap the production possibility set along the selected
dimension, ℓ. The procedure performs m passes, one for each dimension. The hyperplanes
begin parallel to the ℓ-th dimension and end orthogonal to it. In between, the hyperplanes
twist and turn as if hinged at extreme points progressively higher in the ℓ-th dimension.
3. We can see how HyperTwist works in the example depicted in Figure 1. The figure shows the
sequence of rotating hyperplanes in one of the passes of the procedure. Here, “Output 2” is
the selected ℓ-th dimension of a three-dimensional VRS production possibility set. Each pivot
takes place at a different extreme point. In this example, we see how three extreme-efficient
DMUs are detected by the procedure.
4. The classification of a^{j∗_0} as extreme-efficient in the local initializations is a result of the same
principles that apply for translating hyperplane methods. In the event of ties, other extreme
points among them can also be classified as efficient. If additional ties occur for the maximum
ℓ-th component, then these are all efficient.
5. In Steps 0 and 3ii, j∗ must be selected such that a^{j∗}_ℓ is max. If ties persist, it is convenient
to choose an extreme point among them to serve as the new pivot point. This can be done
expeditiously by identifying extreme values of the coordinates of the points in the tie.
6. In Step 3, J is the index set of all the data points on the same current supporting hyperplane
defined by normal π^ℓ + γ∗e_ℓ and level value β. If the cardinality is two, then both are extreme
points of the production possibility set and hence correspond to VRS extreme-efficient DMUs.
If the cardinality is more than two and 0 < γ∗ < M , then all the points involved in the tie
are VRS efficient.
[Figure 1: four panels over axes Input, Output 1, and Output 2 showing one pass of the procedure.
Initialization: first supporting hyperplane is parallel to the reference dimension, Output 2.
Iteration 1: supporting hyperplane after the first pivot; the support set is an edge between two extreme points.
Iteration 2: supporting hyperplane after the second pivot; the edge reveals a third extreme point.
Last iteration: final pivot; the hyperplane is almost orthogonal to the reference dimension.]
Figure 1. One pass of HyperTwist and the sequence of rotating supporting hyperplanes.
7. Each pass of HyperTwist will identify at least one extreme-efficient DMU in its local initial-
ization. There is not, however, any guarantee that any more will be identified; the first pivot
could be the last.
8. Computational requirements for HyperTwist involve only inner products, ratios, and sortings.
Within each pass, the number of inner products per pivot is at most n although it can be
expected to decrease sharply as the pivots progress. The maximum number of pivots in a
pass is n. HyperTwist is another procedure with potential to identify many efficient DMUs.
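The pivot arithmetic above can be sketched in a few lines of Python (a NumPy transcription under our reading of the pseudo-code; the tolerances, tie handling, and toy data are our assumptions):

```python
import numpy as np

def hypertwist(A, tol=1e-9):
    """Sketch of Procedure HyperTwist: for each dimension l, rotate a
    supporting hyperplane with normal pi = u - e_l by increasing gamma in
    pi + gamma * e_l, pivoting at the tied points it meets.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns efficient indices."""
    n, m = A.shape
    efficient = set()
    for l in range(m):
        pi = np.ones(m)
        pi[l] = 0.0                          # pi = u - e_l
        scores = A @ pi                      # <pi, a^j> for all j
        # local initialization: support point, ties broken by max l-th coord
        cand = np.flatnonzero(scores >= scores.max() - tol)
        j_star = int(cand[np.argmax(A[cand, l])])
        efficient.add(j_star)
        while True:
            # smallest rotation gamma_j that brings another point onto the
            # hyperplane; gamma_j plays the role of M (infinity) when the
            # denominator is nonpositive
            num = scores[j_star] - scores
            den = A[:, l] - A[j_star, l]
            safe = np.where(den > tol, den, 1.0)
            gamma = np.where(den > tol, num / safe, np.inf)
            g_star = gamma.min()
            if not np.isfinite(g_star):      # gamma* = M: pass is finished
                break
            beta = scores[j_star] + g_star * A[j_star, l]
            resid = np.abs(scores + g_star * A[:, l] - beta)
            J = np.flatnonzero(resid <= tol * max(1.0, abs(beta)))
            efficient.update(int(j) for j in J)   # tied points are efficient
            j_star = int(J[np.argmax(A[J, l])])   # next pivot: max l-th coord
    return efficient

# Hypothetical toy data: DMUs 0, 1, 3 are efficient; DMU 2 is not
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

Because each pivot moves to a point strictly higher in the ℓ-th coordinate, each pass terminates after at most n pivots, matching Note 8.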
6. Computational results. We tested the five procedures for preprocessing DEA to investigate
their performance and how this is affected by the three most important data characteristics: car-
dinality (number of DMUs), dimension (total number of attributes), and density (proportion of
efficient DMUs). We generated synthetic point sets with cardinalities 2500, 5000, 7500, and 10000
DMUs; in 5, 10, 15, and 20 dimensions; and with densities of 1%, 13%, and 25%. Note that the
largest of these files can be considered large scale problems. The combination of four cardinalities,
four dimensions, and three densities results in 48 data files. This synthetic problem suite allows
us to control for the important DEA characteristics to obtain useful conclusions. To investigate
the performance of the procedures on real data, we applied them to a data set from the Federal
Financial Institutions Examination Council [13], which contains yearly data about commercial
banks.
The synthetic point sets were generated as follows. First, the efficient DMUs were generated as
elements of the boundary of a hypersphere in the orthant defined by the input/output mix. The
inefficient DMUs were generated by taking points on the boundary of the sphere and contracting
them, radially, using a randomly generated factor from a triangular distribution. The idea of
using this distribution was to make it more likely that the interior points would be close to the
boundary of the production possibility set. Next, each dimension (attribute) was scaled using a
factor randomly selected from a uniform distribution between 1 and 1000. This change does not
affect the density of the point set but makes the data more realistic by making it less symmetric.
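The generation scheme just described can be sketched as follows (a Python sketch; the triangular-distribution parameters, the sphere radius, and the handling of the orthant signs are our assumptions, not the authors' exact choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_dea_points(n, m, density, radius=1.0):
    """Sketch of the synthetic-data generator: efficient points on a
    hypersphere in the (positive) orthant, inefficient points contracted
    radially with a triangular factor skewed toward the boundary, then each
    attribute rescaled by a uniform factor in [1, 1000]."""
    n_eff = int(round(density * n))
    # random directions in the orthant, normalized onto the sphere
    U = np.abs(rng.standard_normal((n, m)))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    pts = radius * U
    # contract the inefficient points toward the origin; a mode at 1.0 keeps
    # interior points likely to be close to the boundary (parameters assumed)
    factors = rng.triangular(0.5, 1.0, 1.0, size=n - n_eff)
    pts[n_eff:] *= factors[:, None]
    # per-attribute scaling: changes symmetry but not density
    pts *= rng.uniform(1.0, 1000.0, size=m)
    return pts
```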
The procedures were coded in Fortran. The experiments were performed on a dedicated Pentium
4 PC running at 2.66 GHz with 512 MB of RAM.
Making comparisons between preprocessors based exclusively on number of classifications may
be misleading since the corresponding cpu times tend to be quite different. We propose using as
a common measure for comparisons the yield of the procedures, defined here as the number of
classifications made per tenth of a cpu second. A classification is the identification of an inefficient
DMU in the case of Dominator or an efficient DMU in all other cases. The yields reported below
are the average of three runs. These results appear in Appendix 3.
The computational effort required by DimensionSort and MaxEuclid was hardly measurable
for most of the problems and their contribution is limited. DimensionSort identified m extreme-
efficient DMUs almost all the time (in a few cases it found fewer than m). MaxEuclid always
identified exactly one extreme-efficient DMU. In both cases the clock did not record any usable
cpu time and therefore no yields were calculated. These procedures essentially provide free classi-
fications.
For the remaining three implementations, it is useful as a baseline reference to report on results
obtained in classifying DEA efficiency and inefficiency using LPs. We processed selected point sets
with an LP formulation that starts with the full data set and applies the reduced basis entry
(RBE) enhancement described in [4], [7], [8]. (The yields reported in Table 1 for traditional DEA
studies using LPs are about twice those of unenhanced “naive” implementations ([7], [8])).
Table 1. Yield of enhanced traditional LP approach for selected problems.

Dimension   Cardinality   Density   Yield
    5          2500         1 %    19.5184
   10          5000        13 %     2.5436
   15          7500        13 %     0.3857
   20         10000        25 %     0.1602
The next three preprocessors were dramatically more effective in classifying DMUs than the pre-
vious two. Even though in general it is difficult to find predictable effects given that preprocessors
are vulnerable to data peculiarities, in some instances it is possible to identify general patterns.
We analyze these preprocessing implementations next.
Our implementation of Dominator confirms what Sueyoshi and Chang [2] observed: it is a
powerful preprocessor with the potential to classify a large number of inefficient DMUs at a low
cost. Sueyoshi and Chang’s initial implementations resulted in the identification of 100% of the
inefficient DMUs. They proceeded to modify their problem generator to avoid this condition.
Our inefficient DMUs (points) were generated with this issue in mind, hence the reason for the
triangular distribution for the contraction factor in their generation. Even so, an average of 78.43%
of the inefficient DMUs were totally dominated and thus identified by Dominator.
Figure 2 illustrates representative cases of the behavior of the yield of Dominator when controlling
for cardinality, dimension, and density. The effect of increased cardinality is a clear decrease in the
yield of this procedure. This is to be expected. Even though the number of classifications can
be expected to increase close to linearly given the assumption that density remains the same, the
number of comparisons is almost quadratic in the number of DMUs. The impact of dimension is
also predictable. Detecting whether a DMU dominates another requires comparing all coordinates,
[Figure 2: three panels plotting Dominator's yield against cardinality (Dimension 10; Density 25%),
against dimension (Cardinality 7500; Density 1%), and against density (Cardinality 7500; Dimension 15).]
Figure 2. Yield of Dominator.
causing computational effort to increase with the number of dimensions without additional clas-
sifications, which affects adversely the yield. The results of our experiments make it difficult to
understand the impact of density on yield. Dominator’s yield decreased as density increased from
1% to 13% to 25% when there were five dimensions. With the other dimensions, the yield increases
practically always as density goes from 1% to 13% but then decreases when density changes from
13% to 25%. Lower yields with 1% density than with 13% square with the expectation of a greater
probability, in the latter case, of finding a dominating DMU for each dominated one during the
procedure. One might also expect, however, the following effects to start to prevail and reduce
yields as density increases: 1) more DMUs are efficient and therefore undominated; and 2) there
is an erosion of the effectiveness of the enhancement since fewer dominated DMUs are removed
from the analysis as the procedure progresses. The effect is made dramatic in the limit since a
density of 100% would result in a zero yield. This suggests a relation with density where yields
initially increase due to the impact of extra efficient DMUs that dominate others but eventually
starts to decrease due to the effect of fewer available dominated entities and less advantage from
the enhancement.
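The domination test behind Dominator can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes the DMUs are rows of a NumPy array with every attribute oriented so that larger is better (inputs negated), and the function name is ours. The inner loop skips DMUs already classified as dominated, mirroring the enhancement discussed above.

```python
import numpy as np

def dominator(A):
    """Classify totally dominated DMUs without solving any LPs.

    A is an (n, m) array with all attributes oriented so that larger
    is better (e.g., inputs negated).  Returns a boolean mask where
    True marks a dominated (hence inefficient) DMU.
    """
    n = len(A)
    dominated = np.zeros(n, dtype=bool)
    for j in range(n):
        for k in range(n):
            # Enhancement: DMUs already found dominated are skipped as
            # potential dominators, shrinking later comparisons.
            if k == j or dominated[k]:
                continue
            # a^k totally dominates a^j: >= in every coordinate, > in one.
            if np.all(A[k] >= A[j]) and np.any(A[k] > A[j]):
                dominated[j] = True
                break
    return dominated
```

Because domination is transitive, skipping already-dominated DMUs as candidate dominators cannot miss a classification: every dominated DMU is also dominated by some undominated one.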
The results of the implementation of HyperTran are illustrated in Figure 3. These graphs
were selected to represent what was typically observed. HyperTran also seems to respond more
predictably to cardinality and dimension than to density. Increases in cardinality generate more
work for the procedure without necessarily increasing the number of DMUs classified. This means
we can expect yields to be adversely affected by this attribute and this is confirmed in our tests.
The apparent adverse effect of increases in dimension on the yield can be explained by the increase
in the amount of work in the calculation of inner products. These experiments do not allow any
useful determination about the impact of density on HyperTran’s yield. The procedure may be
[Figure 3. Yield of HyperTran. Three panels: yield vs. cardinality (Dimension 15; Density 13%); yield vs. dimension (Cardinality 10000; Density 25%); yield vs. density (Cardinality 7500; Dimension 5).]
sensitive to the geometry and scaling of the data. Translating hyperplanes would tend to end up at
points with the more extreme magnitudes in dimensions where the attribute units are large.
Also, extreme points may have point clouds nearby that would tend to attract a disproportionate
number of hyperplanes to themselves.
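The translation principle HyperTran relies on — slide a hyperplane with a positive normal outward until it last touches the data; a unique maximizer of the inner product is then an extreme, hence efficient, point — costs one inner product per DMU. A minimal sketch under our own naming and data-layout assumptions (DMUs as rows of a NumPy array, attributes oriented so larger is better):

```python
import numpy as np

def hypertran_step(A, pi):
    """One translation step: slide the hyperplane with normal pi > 0
    outward until it last touches the data.  The unique maximizer of
    <pi, a^j> is an extreme (efficient) point of the VRS hull."""
    scores = A @ pi
    j_star = int(np.argmax(scores))
    # Uniqueness check: with a tie, the maximizer need not be extreme.
    unique = np.sum(np.isclose(scores, scores[j_star])) == 1
    return j_star, unique
```

Repeating this step with different normals classifies additional efficient DMUs, at the cost of one pass of inner products per hyperplane, which is consistent with the sensitivity to dimension and cardinality observed above.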
Figure 4 depicts three typical situations in our experience with HyperTwist. We can see that
increasing cardinality tends to decrease yield. The reason would be the same as with HyperTran;
that is, the number of inner products grows with the number of DMUs without necessarily a
proportional identification of efficient points. The procedure’s yield is also adversely affected by
dimension. Increasing the dimension increases the number of passes through the main loop of
the procedure and the inner products have more components. As density increases, the yield of
HyperTwist tends to increase slightly or remain more or less constant. This may sound
counterintuitive since one would expect HyperTwist to encounter more extreme points in a denser
environment, which should result in noticeable yield improvements. Higher density may result in
more classifications but this is counteracted by the increase in pivots from the additional extreme
points.
HyperTran and HyperTwist are the two most complex procedures studied here and have similar
functionality – the identification of efficient DMUs. Both rely on the same principle about
supporting hyperplanes: their support sets are composed of boundary points. Because
of this, it is appropriate to compare the two.
Hyperplane Translation and HyperTwist. A comparison between HyperTran and HyperTwist
is illustrated in Figure 5. As noted above, the yield of both procedures is adversely impacted by
cardinality and dimension. Higher densities have a slight positive effect on HyperTwist but its
[Figure 4. Yield of HyperTwist. Three panels: yield vs. cardinality (Dimension 10; Density 25%); yield vs. dimension (Cardinality 10000; Density 25%); yield vs. density (Cardinality 7500; Dimension 10).]
effect on HyperTran is not as clear. As Figure 5 shows, and as holds in general, HyperTwist is
the more efficient preprocessor, with yields frequently an order of magnitude or more greater than
those of HyperTran.
We finish this section reporting the performance of Dominator, HyperTran, and HyperTwist on
data from the Federal Financial Institutions Examination Council [13]. Using these data we built
three problems. The first one contains 4,971 DMUs, three inputs, and four outputs; the second has
12,456 DMUs, five inputs, and three outputs; and the third includes 19,939 DMUs, six inputs, and
five outputs. The yield of the procedures, along with that of the LP approach for contrast, is reported
in Table 2:
Table 2. Preprocessors (and LPs) Yield on bank data.
Problem    Card    Dim   Dominator   HyperTran   HyperTwist     LPs
      1    4971      7     612.984       0.379      149.790   8.075
      2   12456      8      91.034       0.014       21.968   0.560
      3   19939     11       0.848       0.070       15.272   0.027
These data display the negative effects on yields of increases in cardinality and dimension ob-
served in the synthetic data sets. HyperTwist continues to outperform HyperTran on these real-
world data.
It is important to remember that these preprocessors are also useful when dealing with polyhe-
dral sets different from the DEA VRS production possibility set.
[Figure 5. Comparison between HyperTran and HyperTwist. Three panels, each plotting the yields of both procedures: yield vs. cardinality (Dimension 20; Density 13%); yield vs. dimension (Cardinality 10000; Density 1%); yield vs. density (Cardinality 7500; Dimension 15).]
7. Concluding remarks. Preprocessors are an important aspect in the development of compu-
tational tools in many areas of OR/MS, especially when speed is critical in large scale applications.
Today, these types of approaches speed up linear programming, integer programming, and count-
less specialized procedures developed for optimization and combinatorics, making them better
able to cope with bigger and more complex problems.
The contributions of this paper are to collect, organize, analyze, implement, test, and compare
different preprocessors to classify DEA data points as efficient and inefficient. We introduce three
new preprocessors to DEA: one based on norm maximization; another on hyperplane translation,
HyperTran; and the third on hyperplane rotation, HyperTwist. We designed a total of five pre-
processing methods for the DEA variable returns to scale model. Computational results show
that the preprocessor to identify inefficient DMUs based on testing for domination, Dominator, is
highly effective. Two other preprocessors, HyperTwist and HyperTran, both based on the principle
that supporting hyperplanes identify efficient entities, produce excellent results with HyperTwist
consistently the better of the pair.
Testing compared yields, defined as the number of DMUs classified as efficient or inefficient
per CPU time unit (a tenth of a second). Testing shows that the yield of preprocessors usually
decreases when cardinality or dimension increases. The impact of changes in density, defined as
the percentage of points that are efficient, is not as clear, but it appears that yield tends to decrease
as density increases with Dominator, while density has little impact on HyperTwist. It is clear,
though, that the computational cost of identifying efficient or inefficient DMUs with preprocessors
is low. The effectiveness of preprocessors stems from the fact that they do not solve LPs and
conduct only simple computations such as sorting and calculating inner products and ratios.
Preprocessors should be a part of the DEA analyst’s toolbox especially when working with large
data sets.
References.
[1] Charnes, A., W.W. Cooper, and E. Rhodes, “Measuring the efficiency of decision making units,”
European Journal of Operational Research, Vol. 2, No. 6, 1978, pp. 429-444.
[2] Sueyoshi, T. and Y-L Chang, “Efficient algorithm for additive and multiplicative models in Data
Envelopment Analysis,” Operations Research Letters, Vol. 8, 1989, pp. 205-213.
[3] Sueyoshi, T., “A special algorithm for an additive model in Data Envelopment Analysis,” Journal
of the Operational Research Society, Vol. 3, 1990, pp. 249-257.
[4] Ali, A.I., “Streamlined computation for data envelopment analysis,” European Journal of Op-
erational Research, Vol. 64, 1993, pp. 61-67.
[5] Charnes, A., W.W. Cooper, B. Golany, L. Seiford, and J. Stutz, “Foundations of data en-
velopment analysis for Pareto-Koopmans efficient empirical production functions,” Journal of
Econometrics, Vol. 30, 1985, pp. 91–107.
[6] Briec, W., “Hölder Distance Function and Measurement of Technical Efficiency,” Journal of
Productivity Analysis, Vol. 11, 1998, pp. 111-131.
[7] Barr, R.S. and M.L. Durchholz, “Parallel and hierarchical decomposition approaches for solving
large-scale Data Envelopment Analysis models,” Annals of Operations Research, Vol. 73, 1997,
pp. 339–372.
[8] Dula, J.H., “A computational study of DEA with massive data sets,” Computers and Opera-
tions Research, in print.
[9] Dula, J.H. and F.J. Lopez, “Algorithms for the frame of a finitely generated unbounded poly-
hedron,” INFORMS Journal on Computing, Vol. 18, 2006, pp. 97–110.
[10] Dula, J.H., R.V. Helgason, B.L. Hickman, “Preprocessing schemes and a solution method for the
convex hull problem in multidimensional space,” Computer Science and Operations Research:
New Developments in Their Interfaces, O. Balci (ed.), pp. 59-70, Pergamon Press, U.K., 1992.
[11] Shaheen, M., Frame of a Pointed Finite Polyhedral Cone, Thesis for Master of Science, Depart-
ment of Economics, Mathematics, and Statistics at the University of Windsor, 2000, Windsor,
Ontario, Canada.
[12] Lopez, F.J. and J.H. Dula, “Adding and removing an attribute in a DEA model: theory and
processing,” under review.
[13] Federal Financial Institutions Examination Council (FFIEC), 2004 Report of Condition and
Income, http://www.chicagofed.org/economic_research_and_data/weekly_report_of_assets_and_liabilities.cfm.
[14] Rockafellar, R.T., Convex Analysis, Princeton University Press, 1970.
APPENDIX 1
Results and Proofs.
Result 1. Let a^ĵ = argmax{‖p − a^j‖_ℓ ; j = 1, . . . , n}, where ‖ · ‖_ℓ is the ℓ-norm of the argument.
If a^ĵ is unique then it is an extreme point of the convex hull of A.

Proof. Set ‖p − a^ĵ‖_ℓ = β and define B(p, β) = {z : ‖p − z‖_ℓ ≤ β}. B(p, β) is the ℓ-ball centered at p
with “radius” β. Two properties of B(p, β) are relevant: 1) it is convex (see [14], pp. 137-138); and
2) the elements of A are in its strict interior except for a^ĵ, which is on the boundary. Therefore,
the convex hull of A is contained in B(p, β) and there exists a supporting hyperplane for the ℓ-ball
at a^ĵ. This hyperplane also supports the convex hull but only at a^ĵ. This is enough to conclude
that a^ĵ is an extreme point of the convex hull.
Result 2. An inefficient DMU for the VRS production possibility set cannot maximize the 2-norm
when the focal point p is given by p_i = min{a^j_i ; j = 1, . . . , n} for all i.

Proof. Any inefficient DMU (including weakly efficient), a, can be expressed as a = ā + v where ā
is a convex combination of the extreme points of the VRS production possibility set and v ≠ 0 is a
direction in the recession cone defined by a positive combination of the directions −e_i; i = 1, . . . , m.
Note that p_i ≤ a_i ≤ ā_i; i = 1, . . . , m, and for some i′, a_{i′} < ā_{i′}. Therefore
‖a − p‖₂ < ‖ā − p‖₂.
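Results 1 and 2 together justify a simple norm-maximization preprocessor: place the focal point at the componentwise minimum of the data and report the unique farthest DMU as extreme-efficient. A minimal sketch with the 2-norm (the function name and the uniqueness tolerance are ours, not the paper's):

```python
import numpy as np

def norm_max_preprocessor(A):
    """Results 1 and 2 sketch: with focal point p_i = min_j a^j_i,
    a unique maximizer of ||a^j - p||_2 over the rows of A is an
    extreme point of the hull, hence an efficient DMU."""
    p = A.min(axis=0)                      # focal point, Result 2
    d = np.linalg.norm(A - p, axis=1)      # 2-norm distances to p
    j_star = int(np.argmax(d))
    # Result 1 requires the maximizer to be unique.
    unique = np.sum(np.isclose(d, d[j_star])) == 1
    return j_star, unique
```

By Result 2 the maximizer cannot be inefficient, so a unique maximizer is classified efficient without any LP.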
APPENDIX 2
HyperTwist:
A Preprocessor Based on Hyperplane Rotation.
Derivation.
Without loss of generality (see Note 1 below) and to simplify notation we develop this derivation
in terms of the m-th dimension. Consider an extreme-efficient DMU a^{j∗} ∈ ℝ^m obtained by
maximizing the translation of a hyperplane with normal (parameterized) vector π(γ) = [π; γ];
0 < π ∈ ℝ^{m−1}; 0 ≤ γ ∈ ℝ. A supporting hyperplane in ℝ^m, H(π(γ), β), for the VRS
production possibility set at a^{j∗} is such that

    ⟨π(0), a^{j∗}⟩ + γ a^{j∗}_m = β                                   (1)
    ⟨π(0), a^j⟩ + γ a^j_m ≤ β;  j = 1, . . . , n.                      (2)

Any rotation of this hyperplane with respect to the m-th axis at the point a^{j∗}, such that the
hyperplane remains a support, has the form

    ⟨π(0), a^{j∗}⟩ + γ∗ a^{j∗}_m = β∗                                 (3)
    ⟨π(0), a^j⟩ + γ∗ a^j_m ≤ β∗;  j = 1, . . . , n,                    (4)

where β∗ and γ∗ are controllable parameters (although not necessarily completely free). It follows
from (3) and (4) that

    ⟨π(0), a^j⟩ + γ∗ a^j_m ≤ ⟨π(0), a^{j∗}⟩ + γ∗ a^{j∗}_m;  j = 1, . . . , n.

Solving for γ∗, when (a^j_m − a^{j∗}_m) > 0, we obtain:

    γ∗ ≤ (⟨π(0), a^{j∗}⟩ − ⟨π(0), a^j⟩) / (a^j_m − a^{j∗}_m).          (5)

The maximum rotation occurs at a point a^ĵ ≠ a^{j∗} where γ∗ equals the right-hand side in (5), so
long as (5) holds for every point a^j ∈ A for which (a^j_m − a^{j∗}_m) > 0. If there does not exist such
a point, the maximum rotation occurs when the hyperplane supports the polyhedral set at a face
orthogonal to the m-th axis defined by one or more directions of recession. The second parameter,
β∗, is now uniquely specified by (3). The new hyperplane, H(π(γ∗), β∗), supports the production
possibility set at both a^{j∗} and a^ĵ.
Notes.
1. Our assumptions about the data mean that the recession cone of the VRS production possibility
set is always the negative orthant, independent of the input/output assignments of the attributes.
For this reason, working in a dimension that corresponds to an input or an output does not
make any difference for the purpose of our development and procedure HyperTwist.
2. If the denominator (a^j_m − a^{j∗}_m) > 0 in (5) then the numerator is strictly positive, assuring
γ∗ > 0. The numerator cannot be negative because the hyperplane H(π(γ), β) supports the
production possibility set at a^{j∗} by construction. It cannot be zero while the condition on the
denominator holds, since this would mean that a^j belongs to this hyperplane, which is impossible
because the point with the largest m-th component on this hyperplane is the one selected.
3. The point a^ĵ will be the one with the largest m-th component and will serve as the “hinge” for
the next twist of the supporting hyperplane. In case of ties, any one of them can be a hinge,
possibly leading to different paths. In procedure HyperTwist we require the identification of
one extreme point to proceed.
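One pivot of the rotation governed by (5) can be sketched as follows. This is our illustrative code, not the authors' implementation: DMUs are rows of a NumPy array, the m-th attribute is the last column, and pi0 holds the first m−1 components of the normal. The function returns the tightest bound on γ∗ and the hinge point attaining it, or None when no point satisfies the denominator condition.

```python
import numpy as np

def max_rotation(A, j_star, pi0):
    """Sketch of one HyperTwist pivot via inequality (5): rotate the
    supporting hyperplane about a^{j*} with respect to the m-th axis
    and return (gamma*, hinge index)."""
    a_star = A[j_star]
    best_gamma, hinge = np.inf, None
    for j in range(len(A)):
        denom = A[j][-1] - a_star[-1]          # a^j_m - a^{j*}_m
        if denom > 0:
            # Right-hand side of (5) for this candidate point.
            gamma = (pi0 @ a_star[:-1] - pi0 @ A[j][:-1]) / denom
            if gamma < best_gamma:
                best_gamma, hinge = gamma, j
    return best_gamma, hinge    # hinge is None if no such point exists
```

The hinge point returned here is the newly identified extreme point that anchors the next twist, as described in Note 3.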
APPENDIX 3
Yield of Preprocessors.
Dimension  Cardinality  Density  HyperTran  HyperTwist  Dominator
    5         2500        1 %       3.28      389.45     1808.37
    5         2500       13 %      37.33     1228.28     1447.92
    5         2500       25 %      59.91      449.37      905.95
    5         2500       50 %      69.13      649.09      416.07
    5         5000        1 %       4.00      199.72     1131.96
    5         5000       13 %       4.53      419.41      708.22
    5         5000       25 %      20.14      291.01      462.30
    5         5000       50 %      32.39      299.57      205.18
    5         7500        1 %       0.66      131.81      726.89
    5         7500       13 %      19.48      199.71      472.15
    5         7500       25 %       5.07      196.39      304.71
    5         7500       50 %       6.84      209.70      135.35
    5        10000        1 %       2.53      104.85      546.17
    5        10000       13 %      30.97      125.82      349.83
    5        10000       25 %      20.73      115.55      227.64
    5        10000       50 %      16.21      122.32      100.53
   10         2500        1 %       4.94      509.29      435.93
   10         2500       13 %      30.91      307.06      571.02
   10         2500       25 %      42.21      479.33      413.01
   10         2500       50 %      29.31      344.51      195.03
   10         5000        1 %       2.15      172.25      480.72
   10         5000       13 %      13.60      209.70      412.38
   10         5000       25 %      17.25      173.09      276.02
   10         5000       50 %      31.50      196.39      117.94
   10         7500        1 %       0.96       52.42      253.80
   10         7500       13 %       4.10       66.15      282.47
   10         7500       25 %       4.53       71.64      179.84
   10         7500       50 %      14.39       65.46       62.69
   10        10000        1 %       0.60       40.85      183.85
   10        10000       13 %       1.63       50.22      215.21
   10        10000       25 %       4.17       55.63      109.17
   10        10000       50 %      17.14       50.64       46.23
Yield of Preprocessors (Cont’d).
Dimension  Cardinality  Density  HyperTran  HyperTwist  Dominator
   15         2500        1 %       4.74      199.72      266.68
   15         2500       13 %      26.99      275.60      387.68
   15         2500       25 %      46.02      323.53      312.46
   15         2500       50 %      17.74      396.93      158.21
   15         5000        1 %       1.03       43.06      111.57
   15         5000       13 %       6.27       64.00      225.45
   15         5000       25 %       7.44       72.40      162.76
   15         5000       50 %       8.07       68.47       65.72
   15         7500        1 %       0.81       31.11       82.95
   15         7500       13 %       8.10       44.53      147.75
   15         7500       25 %       6.17       45.28       92.67
   15         7500       50 %      15.98       45.82       40.48
   15        10000        1 %       0.58       21.30       68.21
   15        10000       13 %       4.27       35.36      103.53
   15        10000       25 %       4.05       36.61       63.11
   15        10000       50 %       3.45       39.03       30.94
   20         2500        1 %       4.34      142.30       82.28
   20         2500       13 %      33.59      234.66      214.61
   20         2500       25 %      31.07      209.70      166.33
   20         2500       50 %      63.83      316.69       98.88
   20         5000        1 %       0.89       34.75       59.25
   20         5000       13 %      12.26       53.44       78.46
   20         5000       25 %       8.47       56.76       70.40
   20         5000       50 %      31.64       55.31       39.95
   20         7500        1 %       0.95       25.85       35.57
   20         7500       13 %       5.11       33.70       68.54
   20         7500       25 %       5.29       36.06       47.15
   20         7500       50 %       3.13       35.45       26.87
   20        10000        1 %       0.55       18.61       25.64
   20        10000       13 %       1.41       29.51       46.10
   20        10000       25 %       1.52       24.48       36.22
   20        10000       50 %       1.26       32.29       20.42