Preprocessing DEA.
J.H. Dula1 and F.J. Lopez2
February 2007
Statement of Scope and Purpose. This is a comprehensive study of preprocessing in DEA. The purpose is
to provide tools that will reduce the computational burden of DEA studies especially in large scale applications.
Abstract. We collect, organize, analyze, implement, test, and compare a comprehensive list of ideas for pre-
processors for entity classification in DEA. We limit our focus to procedures that do not involve solving LPs. The
procedures are adaptations from previous work in DEA and in computational geometry. The result is five preprocessing methods, three of which are new for DEA. Testing shows that preprocessors have the potential to classify a large number of DMUs economically, making them an important computational tool, especially in large scale applications.
Key Words: DEA, DEA computations, linear programming, and computational geometry.
1. Introduction. The paper by Charnes et al. [1] which introduced DEA also offered the first
linear program (LP) formulation for classifying and scoring DMUs. The approach proposed there,
and the standard practice to this day, is to formulate and solve one LP for each entity with the LP’s
size determined by the dimensions of the matrix generated by the full data set. This approach for
efficiency classification and scoring in DEA is computationally intensive. Although much is known
about accelerating the process, solving LPs places heavy computational demands on it, especially
in large scale applications.
A preprocessor in DEA is a procedure that can quickly, efficiently, and conclusively classify
and/or score a DMU without solving an LP. This excludes methods that reduce computational
requirements but still involve solving LPs, either by accelerating their performance or by
extracting opportunistic information from them for classification. Preprocessors are not expected to
1 Corresponding author, School of Business, Virginia Commonwealth University, Richmond, VA 23284, [email protected].
2 College of Business Administration, University of Texas at El Paso, El Paso, TX 79968, [email protected].
conclusively classify or score all the DMUs in a DEA study. They are intended to either reduce the
total number of LPs that will eventually have to be solved and/or to reduce their size so that they
can be solved faster. Preprocessors have a long tradition in DEA computations which includes the
works by Sueyoshi and Chang [2], Sueyoshi [3], and Ali [4].
In the current work, we collect and analyze the methods that have been proposed for prepro-
cessing in DEA and introduce three new ones. Two of the new ones are adaptations of work in the
related field of computational geometry and are used to identify efficient DMUs. The third prepro-
cessor, HyperTwist, is in a category of preprocessors based on hyperplane rotation for uncovering
efficient DMUs. This category of preprocessors appears here first. We classify, formalize, analyze,
implement, and compare the different preprocessors.
2. The Role of preprocessors in DEA. The elements of a DEA study are: i) a model; i.e.,
a list of inputs and outputs that characterize the process, ii) a data set of the DMUs’ values for
the attributes in the model, and iii) a returns to scale assumption about the transformation pro-
cess. These elements define a production possibility set: the set of all viable inputs and outputs
obtainable from all combinations of the data along with all possibilities from the free disposability
consequences of the returns to scale assumption. The production possibility set is a convex poly-
hedral set with a portion of its boundary constituting the efficient frontier. A DMU is efficient if
and only if it belongs to the efficient frontier of its production possibility set.
One of the main objectives of any DEA study is the classification of entities as efficient or
inefficient. This classification depends only on the three fundamental components above. A DEA
study may also require the calculation of a “score” for each DMU and associated benchmarking
information when these are inefficient. An entity’s score is the objective function value of an LP
and provides information about its relative position with respect to the efficient frontier. Different
LP formulations provide different scores and benchmarks. Although scores are used in practice
to classify DMUs, the solutions may not provide sufficient conditions for classification as in the
case of the relaxed input and output oriented LP formulations. Scoring is not required in all DEA
studies; evidence of this is the use of the familiar additive LP formulation of Charnes et al. [5].
These LPs provide necessary and sufficient conditions for classification but their scores are mostly
useless since they maximize the 1-norm to the efficient frontier (Briec, [6]).
All the preprocessors discussed here are used to classify DMUs. With the exception of studies
requiring super-efficiency scores, advanced classification of an efficient DMU with an inexpensive
preprocessor saves having to solve an LP altogether for that entity since it is automatically classified
and scored. Any inefficient DMU that is classified by a preprocessor also obviates an LP solution
if scores are not required. The value of an efficient, effective, and economical preprocessor in these
situations is evident.
Low cost classification of inefficient DMUs can also save work in the event that scores and
benchmarks are required. If a DMU is inefficient, its data point can be omitted from the data
matrix of any LP used for any remaining classifications and scoring. The technique based on this
result is called Reduced Basis Entry (RBE) (Ali [4]). Experimental work by Barr and Durchholz
[7] and Dula [8] has shown that RBE can reduce computations in DEA by 50%; the smaller LPs
that remain after employing preprocessors can therefore be solved in much less than half the time.
This suggests that knowledge of inefficient DMUs prior to having to solve any LPs can result in
substantial computational savings especially when relatively many are identified cheaply in large
problems where the proportion of efficient to inefficient DMUs (“density”) is low.
3. Notation and assumptions. A point set consists of n points a^j, j = 1, . . . , n, each with m
dimensions. The set A collects the data points; i.e., A = {a^1, . . . , a^n}. The i-th coordinate of a^j is
denoted by a^j_i. DEA points are composed of two parts, as follows:

    a^j = [−X^j ; Y^j] ∈ ℝ^m;  j = 1, . . . , n;

where 0 ≠ X^j ≥ 0 and 0 ≠ Y^j ≥ 0 are the input and output data vectors, respectively, for DMU
j. We assume that this set is reduced in the sense that no point is duplicated. We denote by
H(π, β) = {y | 〈π, y〉 = β} the hyperplane with normal vector π and level value β, where 〈·, ·〉 is the
inner product of two vectors.
Our development focuses on the VRS production possibility set of DEA. The results are true
for general polyhedral sets and therefore for the other DEA production possibility sets, if not
immediately, with minor adjustments.
4. Background and preprocessing principles. Classifying a DMU as efficient or inefficient is
essentially equivalent to identifying boundary and interior points in a finitely generated polyhedral
set; that is, a polyhedron defined by linear combinations of the elements of a point set (Dula and
Lopez [9]). In DEA, the point set is composed of the data for n DMUs each characterized by m
attribute values. The polyhedral set they generate, depending on the returns to scale assumption,
is the production possibility set. General polyhedral sets can have many shapes. They range from
an unbounded polyhedron, as in DEA, to a fully bounded polytope, such as the convex hull of the
point set.
The problem of identifying boundary and/or interior points in finitely generated polyhedral
sets appears in other areas. It is familiar in computational geometry, point depth analysis in
nonparametric multivariate statistics, redundancy in systems of linear inequalities, and stochastic
programming (see Dula and Lopez [9] for more details and references). Ideas for preprocessing
convex hulls to identify extreme points appear in Dula et al. [10]. Many results from these areas
are available to DEA, but most relevant here is the work on preprocessing for convex hulls.
Previous research on preprocessors for finitely generated polyhedral sets comes mainly from two
sources with different backgrounds: DEA and computational geometry. Dula et al. [10] propose
preprocessors to identify extreme points – an important class of boundary points – specifically
for convex hulls. The procedures in [10] incorporate a variety of ideas ranging from simple sortings
to calculating Euclidean distances to techniques based on inner products. Sueyoshi and Chang
[2], Sueyoshi [3], and Ali [4] propose preprocessors specifically for DEA. Sueyoshi and Chang [2]
introduce the concept of domination to identify inefficient DMUs. Ali [4] also looks at domination
and proposes simple and effective preprocessors for identifying efficient DMUs based on sorting
and inner products.
Existing preprocessors can be classified as: i) approaches that identify boundary points of
polyhedral sets, of which extreme points are especially important (e.g., efficient and extreme-
efficient DMUs in DEA), and ii) methodologies that identify interior points (e.g., inefficient DMUs
in DEA). In the first category there are three basic ideas:
Sorting. The points with the maximum value in each dimension in the VRS DEA model are
extreme-efficient if unique. If not unique, they correspond to DMUs on the boundary, which
means they may or may not be efficient (e.g., weak efficient). These points are identified by
sorting the dimensions. The computational effort for this is minimal and has the potential
of identifying m efficient DMUs in DEA. This idea has been used in Dula et al. [10] in the
context of general convex hulls, and in Ali [4] in the context of DEA.
An adaptation to the DEA constant returns to scale model appears in Shaheen [11]. A
DMU that generates a unique minimum ratio of its components with a projection’s norm is
necessarily an extreme ray of the production possibility set. At most m new CRS extreme-
efficient DMUs can be identified this way.
Norm maximization. Dula et al. [10] propose identifying extreme points of the convex hull
by finding the element in A that maximizes the Euclidean distance to an arbitrary point, p,
in ℝ^m. This is a special case of a more general result given next:

Result 1. Let a^{j∗} = argmax{‖p − a^j‖_ℓ ; j = 1, . . . , n}, where ‖ · ‖_ℓ is the ℓ-norm of the
argument. If a^{j∗} is unique then it is an extreme point of the convex hull of A.
Proof. See Appendix 1.
The procedure in [10] is based on an application of Result 1 for the case of the 2-norm. In this
norm, ties reveal additional extreme points of the convex hull. This will be true of any norm
where support sets for a ball contain exactly a single point, which is not the case for the
1-norm and ∞-norm. Result 1 can be adapted to a DEA VRS production possibility set to
identify extreme-efficient DMUs when the focal point p is judiciously chosen. For example,
for the case of the 2-norm, one such point is the “worst virtual DMU:” p_i = min{a^j_i ; j =
1, . . . , n}, ∀i. Notice that the ∞-norm identifies one of the points revealed through sorting;
namely the one with the overall largest component and, as we will see below, an idea proposed
by Ali in [4] is an application of Result 1 to the 1-norm. Computations for a procedure based
on norm maximization involve calculating and sorting inner products. Result 1 is a new
contribution to computational geometry and DEA.
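As a concrete illustration of Result 1, the following sketch (with a small hypothetical single-input, single-output data set in the (−X, Y) convention; the numbers are ours, not from the paper) computes the maximizer of ‖p − a^j‖_ℓ from the worst virtual DMU for several ℓ-norms. Each unique maximizer is an extreme point, and different norms can reveal different ones.

```python
import numpy as np

# Hypothetical toy data: four DMUs with one input and one output,
# stored as a^j = (-X^j, Y^j); the DMU at index 2 is inefficient.
A = np.array([[-1.0, 1.0],
              [-2.0, 3.0],
              [-2.0, 2.0],
              [-3.0, 3.5]])

p = A.min(axis=0)  # "worst virtual DMU" focal point

for ell in (1, 2, np.inf):
    dists = np.linalg.norm(A - p, ord=ell, axis=1)
    # Result 1: a unique maximizer is an extreme point of conv(A)
    print(f"{ell}-norm maximizer: DMU index {int(np.argmax(dists))}")
```

On this toy set the 1-norm and the 2-norm pick out different extreme-efficient DMUs, consistent with the remark that Ali's aggregation-ratio idea (1-norm) and the 2-norm procedure are different instances of the same result.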
Hyperplane translation. The values of the inner products of the data points in a point set,
A, with some arbitrary vector π ≠ 0 attain a maximum value β∗ at a point, say a^{j∗}. The
vector π and the value β∗ define a hyperplane H(π, β∗), which supports the convex hull of
A at a^{j∗}. If a^{j∗} is the unique support element, this suffices to conclude that it is an extreme
point. When there is more than one support element, all are boundary points. By restricting
π > 0, the maximum inner product, β∗, of the data points with this positive normal defines
a hyperplane, H(π, β∗), that supports the VRS production possibility set. If the support set
is a singleton then the corresponding DMU is extreme-efficient. If the support set contains
more points, all DMUs participating in the tie will be efficient. Since π > 0, the support set
cannot contain weak efficient DMUs. This idea can be visualized as translating a hyperplane
in the direction of its normal through the polyhedral hull until it reaches the last point.
The computational effort involves inner products and sorting. The idea for convex hulls was
implemented and tested in Dula et al. [10]. Ali’s [4] “output/input aggregation ratio” (Lemma
3, p. 64) is a special case where a specific hyperplane with normal vector π = (1, . . . , 1) is
used. (This is also an application of Result 1 using the 1-norm). The new preprocessor based
on hyperplane translation proposed below specifically for DEA allows the use of multiple
hyperplanes restricted only by the condition π > 0 on their normals.
The second category of preprocessors identifies inefficient DMUs. We are aware of only one
approach in this category.
Domination. DMU k totally dominates DMU j, k ≠ j, if a^k ≥ a^j. The dominated DMU is clearly
inefficient. A full implementation of this preprocessor involves at most a quadratic number
of vector subtractions and comparisons. This has the potential of identifying a large number
of inefficient DMUs. The procedure applies to all return to scale assumptions. This idea was
originally proposed by Sueyoshi and Chang [2].
In the next section we implement the preprocessing ideas described above for the case of the
VRS DEA model. In addition, we introduce two new preprocessors for identifying efficient DMUs
in DEA. The first, HyperTran, is an adaptation of the hyperplane translation procedure for convex
hulls from Dula et al. [10]. The second, HyperTwist, falls into an entirely new category based on
hyperplane rotation. It results from adapting a procedure introduced in Lopez and Dula [12] to
assess the impact of adding a new attribute to a DEA model.
5. Implementations. The preprocessing principles discussed in Section 4 can be the basis
for a variety of methods for preprocessing in DEA. Methods can be deterministic or probabilistic,
parameter-dependent or parameter-free, etc. In the spirit of consistency we propose methods based
on the following guidelines: i) they are deterministic; ii) they do not depend on defining parameters
that require experimental tuning; and iii) the number of computations is bounded. The methods
produced under these guidelines are not affected by convergence issues or other stopping criteria.
This provides consistency across the different preprocessing principles and facilitates comparisons.
We design five methods specifically for VRS DEA. The first two, DimensionSort and Dominator,
have been used in DEA before. The other three preprocessors, MaxEuclid, HyperTran, and
HyperTwist, are new to DEA.
The DMUs in the data set are classified as follows:
E is the set of efficient DMUs.
E∗ is the set of extreme-efficient DMUs. Note E∗ ⊆ E .
I is the set of inefficient DMUs.
Each method is formally presented below. The pseudo-codes are followed by a discussion.
Sorting Method: DimensionSort

Procedure: DimensionSort
[INPUT:] A.
[OUTPUT:] Ê∗ ⊆ E∗ ⊆ A.
Initialization. Ê∗ ← ∅.
For i = 1 to m, Do:
    j∗ = argmax_j {a^j_i ; j = 1, . . . , n}.
    If j∗ is unique, Then,
        Ê∗ ← Ê∗ ∪ {a^{j∗}}.
    Else, Resolve Ties
        (Resolve ties by applying this procedure recursively to the points
        involved in the ties on the remaining dimensions).
    End if.
Next i.
Finalization. Ê∗ contains extreme-efficient DMUs.
Notes on Procedure DimensionSort.
1. Procedure DimensionSort identifies extreme-efficient DMUs. Each DMU that emerges from
a pass will have at least one component which is the largest in its dimension. This is sufficient
to conclude that the corresponding point is extreme in the production possibility set. This
is guaranteed by the tie resolution procedure and by the no duplication assumption. The
number of different extreme-efficient DMUs that this method identifies is at most m.
2. Each of the m iterations requires sorting n values. Note that tie resolution can be invoked as
many as m − 1 times within each iteration with possibly as many as n points participating
in each tie.
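A minimal executable sketch of the sorting idea, in Python with NumPy, is given below. It omits the recursive tie-resolution step, and the function name and toy data are ours.

```python
import numpy as np

def dimension_sort(A):
    """Sketch of Procedure DimensionSort: a DMU that uniquely attains the
    maximum value in some dimension is extreme-efficient.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns a set of indices."""
    found = set()
    for i in range(A.shape[1]):
        col = A[:, i]
        top = np.flatnonzero(col == col.max())
        if len(top) == 1:              # unique maximizer in dimension i
            found.add(int(top[0]))
        # ties would be resolved by recursing on the remaining dimensions
    return found

# Hypothetical toy data: a^j = (-X^j, Y^j), one input and one output
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

On this toy set the procedure identifies the DMU with the smallest input and the DMU with the largest output.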
Domination Method: Dominator

Procedure: Dominator
[INPUT:] A.
[OUTPUT:] Î ⊆ I ⊆ A.
Initialization. Î ← ∅.
For j = 1 to n, Do:
    For k = 1 to n, k ≠ j, Do:
        If a^k is still unclassified,
            If a^j ≥ a^k,
                Î ← Î ∪ {a^k}.
            End if.
        End if.
    Next k.
Next j.
Finalization. Î contains inefficient DMUs.
Notes on Procedure Dominator.
1. Procedure Dominator identifies exclusively inefficient DMUs, including weak efficient ones.
Dominator has the potential to identify a large subset of the inefficient DMUs.
2. The implementation is enhanced by omitting from subsequent comparisons points that are
identified as inefficient. This means we can expect to handle fewer and fewer points as the
procedure progresses.
3. Decomposition Schemes. Points in the interior of polyhedral sets defined by subsets of the
data are also interior to the polyhedron generated by the complete point set. An effective
preprocessor can be based on partitioning blocks to identify inefficient DMUs and repeating
this with new blocks composed of entities with unknown status until a final single block
is processed. This decomposition approach can be designed using Dominator to identify
inefficient DMUs within blocks. Since an implementation requires a decision about the size of
the initial and intermediary blocks, it will involve experimental tuning disqualifying it from
further consideration in this article.
4. The procedure requires at most (n − 1)² comparisons. The enhancement may reduce this
substantially.
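The domination test and the enhancement of Note 2 can be sketched as follows (a Python transcription; names and toy data are ours):

```python
import numpy as np

def dominator(A):
    """Sketch of Procedure Dominator: a point that is componentwise dominated
    by another, distinct point is inefficient (the point set is assumed to
    contain no duplicates, per the paper's reduction assumption).

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns inefficient indices."""
    n = A.shape[0]
    inefficient = set()
    for j in range(n):
        for k in range(n):
            # enhancement: skip points already classified as inefficient
            if k == j or k in inefficient:
                continue
            if np.all(A[j] >= A[k]):   # a^j >= a^k componentwise
                inefficient.add(k)
    return inefficient

# Hypothetical toy data: the DMU at index 2 is dominated by the one at index 1
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```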
Euclidean Distance Method: MaxEuclid

Procedure: MaxEuclid
[INPUT:] A.
[OUTPUT:] Ê∗ ⊆ E∗ ⊆ A.
Initialization. Ê∗ ← ∅.
For i = 1 to m, Do:
    p_i = min_j a^j_i.
Next i.
j∗ = argmax_j {〈a^j − p, a^j − p〉 ; j = 1, . . . , n}.
Ê∗ ← Ê∗ ∪ {a^{j∗}}.
Finalization. Ê∗ contains extreme-efficient DMUs.
Notes on Procedure MaxEuclid.
1. Procedure MaxEuclid applies a special case of Result 1 above. Although other norms may
reveal different extreme-efficient DMUs, we anticipate that the same maximizer would emerge
in many of these.
2. Only extreme points maximize the Euclidean distance from a properly selected focal point.
Although the convex hull of the data when the focal point, p, is included may generate
extreme points that are not extreme-efficient DMUs, Result 2 in Appendix 1 demonstrates
how any such points cannot maximize the 2-norm. For this reason, Procedure MaxEuclid
only uncovers extreme-efficient DMUs and, in case of ties, all points are extreme-efficient
although not necessarily on a common face. As many as all the extreme-efficient DMUs
could be identified by MaxEuclid if they are located on the boundary of an m-dimensional
hypersphere when the focal point p is at the center. More realistically, MaxEuclid will identify
one extreme-efficient DMU; more than that is unlikely.
3. Procedure MaxEuclid requires the calculation of a focal point. The point used by the proce-
dure, p, can be interpreted as a “worst virtual DMU”. Other focal points are possible. For
example, focal points can be located such that the point set belongs to the positive orthant
that the focal point determines. Some sort of parameter needs to be defined to generate a
sequence of useful focal points that are sufficiently separated from each other so as to result
in the identification of different extreme points. This parametric dependence violates our
guidelines.
4. This implementation of the 2-norm for identifying extreme-efficient DMUs requires calculating
and sorting n inner products. Note that the ordering of values obtained by the 2-norm is the
same as that of their squares.
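A sketch of the procedure in Python (our function name and toy data; per Note 4, the squared distances are sorted since their order matches that of the 2-norm):

```python
import numpy as np

def maxeuclid(A):
    """Sketch of Procedure MaxEuclid: the maximizer of the squared Euclidean
    distance from the "worst virtual DMU" is extreme-efficient (assuming the
    maximum is unique; tied maximizers would all be extreme-efficient).

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns one index."""
    p = A.min(axis=0)                    # worst virtual DMU, p_i = min_j a^j_i
    d2 = ((A - p) ** 2).sum(axis=1)      # squared 2-norm; order-preserving
    return int(np.argmax(d2))

# Hypothetical toy data: a^j = (-X^j, Y^j)
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```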
Translating Hyperplanes Method: HyperTran

Procedure: HyperTran
[INPUT:] A.
[OUTPUT:] Ê ⊆ E ⊆ A.
Initialization. Ê ← ∅.
For i = 1 to m, Do:
    p_i = min_j (a^j_i − ε).
Next i.
For j = 1 to n, Do:
    π^j = a^j − p.
    j∗ = argmax_k {〈π^j, a^k〉 ; k = 1, . . . , n}.
    Ê ← Ê ∪ {a^{j∗}}.
Next j.
Finalization. Ê contains efficient DMUs.
Notes on Procedure HyperTran.
1. Procedure HyperTran identifies efficient DMUs by translating hyperplanes until they become
supports for the production possibility set. Efficiency is assured by the fact that the normals,
π^j, of the supporting hyperplanes are strictly positive. The procedure is not confounded
by weak efficiency. Every hyperplane translation will identify at least one extreme-efficient
DMU; in case of ties, all are efficient.
2. The point p used in the procedure is, again, the “worst virtual DMU” except that it undergoes a
slight perturbation to assure that π^j > 0. As with MaxEuclid, other focal points are possible.
HyperTran has the potential to identify many of the efficient DMUs since every data point
generates a translating hyperplane.
3. The procedure requires n² inner products and n sortings, one for each π^j defined.
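A sketch of the translation idea in Python (the perturbation, tolerance, and toy data are ours; all tied maximizers are recorded as efficient, per Note 1):

```python
import numpy as np

def hypertran(A, eps=1e-6, tol=1e-9):
    """Sketch of Procedure HyperTran: each data point defines a strictly
    positive normal pi^j = a^j - p; the maximizers of <pi^j, .> over A are
    efficient DMUs.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns efficient indices."""
    p = A.min(axis=0) - eps              # perturbed "worst virtual DMU"
    efficient = set()
    for aj in A:
        pi = aj - p                      # strictly positive normal
        scores = A @ pi
        cutoff = scores.max() - tol * max(1.0, abs(scores.max()))
        efficient.update(int(j) for j in np.flatnonzero(scores >= cutoff))
    return efficient

# Hypothetical toy data: DMUs 0, 1, 3 are efficient; DMU 2 is not
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

On this toy set every efficient DMU is recovered because each one wins the translation generated by its own data point.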
Rotating Hyperplanes Method: HyperTwist.
We use the following terms: u = (1, . . . , 1) ∈ ℝ^m and the ℓ-th unit vector in ℝ^m is e_ℓ, ℓ =
1, . . . , m.
Procedure: HyperTwist
[INPUT:] A.
[OUTPUT:] Ê ⊆ E ⊆ A.
Global Initialization. Ê ← ∅.
For ℓ = 1 to m, Do:
    Step 0. (Local Initialization)
        k = 0; π^ℓ = u − e_ℓ;
        j∗_0 = argmax_j {〈π^ℓ, a^j〉 ; j = 1, . . . , n}.
        Select j∗_0 such that a^{j∗_0}_ℓ is max.
        Ê ← Ê ∪ {a^{j∗_0}}.
    Step 1. (Pivot)
        i. k = k + 1.
        ii. For j = 1, . . . , n, Do:
            γ_j = (〈π^ℓ, a^{j∗_{k−1}}〉 − 〈π^ℓ, a^j〉) / (a^j_ℓ − a^{j∗_{k−1}}_ℓ), if (a^j_ℓ − a^{j∗_{k−1}}_ℓ) > 0;
            γ_j = M (a large number), otherwise.
            Next j.
        iii. γ∗ = min_j γ_j.
        iv. If γ∗ = M, STOP. Otherwise go to Step 2.
    Step 2. (Find RHS)
        β = 〈π^ℓ, a^{j∗_{k−1}}〉 + γ∗ a^{j∗_{k−1}}_ℓ.
    Step 3.
        i. Define J = {j | 〈π^ℓ, a^j〉 + γ∗ a^j_ℓ = β}.
        ii. Select j∗_k ∈ J such that a^{j∗_k}_ℓ is max.
            Ê ← Ê ∪ {a^{j∗_k}}. Go to Step 1.
Next ℓ.
Notes on Procedure HyperTwist.
1. The derivation of the results that make this procedure work has been relegated to an appendix.
2. The procedure generates a sequence of supporting hyperplanes changing their orientation
as they visit extreme points of the production possibility set. Each change in orientation
corresponds to a pivot operation. One pass of Procedure HyperTwist generates a sequence of
supporting hyperplanes that partially wrap the production possibility set along the selected
dimension, ℓ. The procedure performs m passes, one for each dimension. The hyperplanes
begin parallel to the ℓ-th dimension and end orthogonal to it. In between, the hyperplanes
twist and turn as if hinged at extreme points progressively higher in the ℓ-th dimension.
3. We can see how HyperTwist works in the example depicted in Figure 1. The figure shows the
sequence of rotating hyperplanes in one of the passes of the procedure. Here, “Output 2” is
the selected ℓ-th dimension of a three-dimensional VRS production possibility set. Each pivot
takes place at a different extreme point. In this example, we see how three extreme-efficient
DMUs are detected by the procedure.
4. The classification of a^{j∗_0} as extreme-efficient in the local initializations is a result of the same
principles that apply for translating hyperplane methods. In the event of ties, other extreme
points among them can also be classified as efficient. If additional ties occur for the maximum
ℓ-th component, then these are all efficient.
5. In Steps 0 and 3ii, j∗ must be selected such that a^{j∗}_ℓ is max. If ties persist, it is convenient
to choose an extreme point among them to serve as the new pivot point. This can be done
expeditiously by identifying extreme values of the coordinates of the points in the tie.
6. In Step 3, J is the index set of all the data points on the same current supporting hyperplane
defined by normal π^ℓ + γ∗e_ℓ and level value β. If the cardinality is two, then both are extreme
points of the production possibility set and hence correspond to VRS extreme-efficient DMUs.
If the cardinality is more than two and 0 < γ∗ < M , then all the points involved in the tie
are VRS efficient.
[Figure 1: four panels over axes Input, Output 1, and Output 2 showing one pass of the procedure.
Initialization: first supporting hyperplane is parallel to the reference dimension, Output 2.
Iteration 1: supporting hyperplane after the first pivot; the support set is an edge between two extreme points.
Iteration 2: supporting hyperplane after the second pivot; the edge reveals a third extreme point.
Last iteration: final pivot; the hyperplane is almost orthogonal to the reference dimension.]
Figure 1. One pass of HyperTwist and the sequence of rotating supporting hyperplanes.
7. Each pass of HyperTwist will identify at least one extreme-efficient DMU in its local initial-
ization. There is not, however, any guarantee that any more will be identified; the first pivot
could be the last.
8. Computational requirements for HyperTwist involve only inner products, ratios, and sortings.
Within each pass, the number of inner products per pivot is at most n although it can be
expected to decrease sharply as the pivots progress. The maximum number of pivots in a
pass is n. HyperTwist is another procedure with potential to identify many efficient DMUs.
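The pivot arithmetic above can be sketched in a few lines of Python (a NumPy transcription under our reading of the pseudo-code; the tolerances, tie handling, and toy data are our assumptions):

```python
import numpy as np

def hypertwist(A, tol=1e-9):
    """Sketch of Procedure HyperTwist: for each dimension l, rotate a
    supporting hyperplane with normal pi = u - e_l by increasing gamma in
    pi + gamma * e_l, pivoting at the tied points it meets.

    A: (n, m) array of points a^j = (-X^j, Y^j). Returns efficient indices."""
    n, m = A.shape
    efficient = set()
    for l in range(m):
        pi = np.ones(m)
        pi[l] = 0.0                          # pi = u - e_l
        scores = A @ pi                      # <pi, a^j> for all j
        # local initialization: support point, ties broken by max l-th coord
        cand = np.flatnonzero(scores >= scores.max() - tol)
        j_star = int(cand[np.argmax(A[cand, l])])
        efficient.add(j_star)
        while True:
            # smallest rotation gamma_j that brings another point onto the
            # hyperplane; gamma_j plays the role of M (infinity) when the
            # denominator is nonpositive
            num = scores[j_star] - scores
            den = A[:, l] - A[j_star, l]
            safe = np.where(den > tol, den, 1.0)
            gamma = np.where(den > tol, num / safe, np.inf)
            g_star = gamma.min()
            if not np.isfinite(g_star):      # gamma* = M: pass is finished
                break
            beta = scores[j_star] + g_star * A[j_star, l]
            resid = np.abs(scores + g_star * A[:, l] - beta)
            J = np.flatnonzero(resid <= tol * max(1.0, abs(beta)))
            efficient.update(int(j) for j in J)   # tied points are efficient
            j_star = int(J[np.argmax(A[J, l])])   # next pivot: max l-th coord
    return efficient

# Hypothetical toy data: DMUs 0, 1, 3 are efficient; DMU 2 is not
A = np.array([[-1.0, 1.0], [-2.0, 3.0], [-2.0, 2.0], [-3.0, 3.5]])
```

Because each pivot moves to a point strictly higher in the ℓ-th coordinate, each pass terminates after at most n pivots, matching Note 8.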
6. Computational results. We tested the five procedures for preprocessing DEA to investigate
their performance and how this is affected by the three most important data characteristics: car-
dinality (number of DMUs), dimension (total number of attributes), and density (proportion of
efficient DMUs). We generated synthetic point sets with cardinalities 2500, 5000, 7500, and 10000
DMUs; in 5, 10, 15, and 20 dimensions; and with densities of 1%, 13%, and 25%. Note that the
largest of these files can be considered large scale problems. The combination of four cardinalities,
four dimensions, and three densities results in 48 data files. This synthetic problem suite allows
us to control for the important DEA characteristics to obtain useful conclusions. To investigate
the performance of the procedures on real data, we applied them to a data set from the Federal
Financial Institutions Examination Council [13], which contains yearly data about commercial
banks.
The synthetic point sets were generated as follows. First, the efficient DMUs were generated as
elements of the boundary of a hypersphere in the orthant defined by the input/output mix. The
inefficient DMUs were generated by taking points on the boundary of the sphere and contracting
them, radially, using a randomly generated factor from a triangular distribution. The idea of
using this distribution was to make it more likely that the interior points would be close to the
boundary of the production possibility set. Next, each dimension (attribute) was scaled using a
factor randomly selected from a uniform distribution between 1 and 1000. This change does not
affect the density of the point set but makes the data more realistic by making it less symmetric.
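The generation scheme just described can be sketched as follows (a Python sketch; the triangular-distribution parameters, the sphere radius, and the handling of the orthant signs are our assumptions, not the authors' exact choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_dea_points(n, m, density, radius=1.0):
    """Sketch of the synthetic-data generator: efficient points on a
    hypersphere in the (positive) orthant, inefficient points contracted
    radially with a triangular factor skewed toward the boundary, then each
    attribute rescaled by a uniform factor in [1, 1000]."""
    n_eff = int(round(density * n))
    # random directions in the orthant, normalized onto the sphere
    U = np.abs(rng.standard_normal((n, m)))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    pts = radius * U
    # contract the inefficient points toward the origin; a mode at 1.0 keeps
    # interior points likely to be close to the boundary (parameters assumed)
    factors = rng.triangular(0.5, 1.0, 1.0, size=n - n_eff)
    pts[n_eff:] *= factors[:, None]
    # per-attribute scaling: changes symmetry but not density
    pts *= rng.uniform(1.0, 1000.0, size=m)
    return pts
```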
The procedures were coded in Fortran. The experiments were performed on a dedicated Pentium
4 PC running at 2.66 GHz with 512 MB of RAM.
Making comparisons between preprocessors based exclusively on number of classifications may
be misleading since the corresponding cpu times tend to be quite different. We propose using as
a common measure for comparisons the yield of the procedures, defined here as the number of
classifications made per tenth of a cpu second. A classification is the identification of an inefficient
DMU in the case of Dominator or an efficient DMU in all other cases. The yields reported below
are the average of three runs. These results appear in Appendix 3.
The computational effort required by DimensionSort and MaxEuclid was hardly measurable
for most of the problems and their contribution is limited. DimensionSort identified m extreme-
efficient DMUs almost all the time (in a few cases it found fewer than m). MaxEuclid always
identified exactly one extreme-efficient DMU. In both cases the clock did not record any usable
cpu time and therefore no yields were calculated. These procedures essentially provide free classi-
fications.
For the remaining three implementations, it is useful as a baseline reference to report on results
obtained in classifying DEA efficiency and inefficiency using LPs. We processed selected point sets
with an LP formulation that starts with the full data set and applies the reduced basis entry
(RBE) enhancement described in [4], [7], [8]. (The yields reported in Table 1 for traditional DEA
studies using LPs are about twice those of unenhanced “naive” implementations ([7], [8])).
Table 1. Yield of enhanced traditional LP approach for selected problems.

Dimension   Cardinality   Density   Yield
    5          2500         1 %    19.5184
   10          5000        13 %     2.5436
   15          7500        13 %     0.3857
   20         10000        25 %     0.1602
The next three preprocessors were dramatically more effective in classifying DMUs than the pre-
vious two. Even though in general it is difficult to find predictable effects given that preprocessors
are vulnerable to data peculiarities, in some instances it is possible to identify general patterns.
We analyze these preprocessing implementations next.
Our implementation of Dominator confirms what Sueyoshi and Chang [2] observed: it is a
powerful preprocessor with the potential to classify a large number of inefficient DMUs at a low
cost. Sueyoshi and Chang’s initial implementations resulted in the identification of 100% of the
inefficient DMUs. They proceeded to modify their problem generator to avoid this condition.
Our inefficient DMUs (points) were generated with this issue in mind, hence the reason for the
triangular distribution for the contraction factor in their generation. Even so, an average of 78.43%
of the inefficient DMUs were totally dominated and thus identified by Dominator.
Figure 2 illustrates representative cases of the behavior of the yield of Dominator when controlling
for cardinality, dimension, and density. The effect of increased cardinality is a clear decrease in the
yield of this procedure. This is to be expected. Even though the number of classifications can
be expected to increase close to linearly given the assumption that density remains the same, the
number of comparisons is almost quadratic in the number of DMUs. The impact of dimension is
also predictable. Detecting whether a DMU dominates another requires comparing all coordinates,
[Figure 2: three panels plotting Dominator's yield against cardinality (Dimension 10; Density 25%),
against dimension (Cardinality 7500; Density 1%), and against density (Cardinality 7500; Dimension 15).]
Figure 2. Yield of Dominator.
causing computational effort to increase with the number of dimensions without additional clas-
sifications, which affects adversely the yield. The results of our experiments make it difficult to
understand the impact of density on yield. Dominator’s yield decreased as density increased from
1% to 13% to 25% when there were five dimensions. With the other dimensions, the yield increases
practically always as density goes from 1% to 13% but then decreases when density changes from
13% to 25%. Lower yields with 1% density than with 13% square with the expectation of a greater
probability, in the latter case, of finding a dominating DMU for each dominated one during the
procedure. One might also expect, however, the following effects to start to prevail and reduce
yields as density increases: 1) more DMUs are efficient and therefore undominated; and 2) there
is an erosion of the effectiveness of the enhancement since fewer dominated DMUs are removed
from the analysis as the procedure progresses. The effect is made dramatic in the limit since a
density of 100% would result in a zero yield. This suggests a relation with density where yields
initially increase due to the impact of extra efficient DMUs that dominate others but eventually
starts to decrease due to the effect of fewer available dominated entities and less advantage from
the enhancement.
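The domination test behind Dominator can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes the DMUs are rows of a NumPy array with every attribute oriented so that larger is better (inputs negated), and the function name is ours. The inner loop skips DMUs already classified as dominated, mirroring the enhancement discussed above.

```python
import numpy as np

def dominator(A):
    """Classify totally dominated DMUs without solving any LPs.

    A is an (n, m) array with all attributes oriented so that larger
    is better (e.g., inputs negated).  Returns a boolean mask where
    True marks a dominated (hence inefficient) DMU.
    """
    n = len(A)
    dominated = np.zeros(n, dtype=bool)
    for j in range(n):
        for k in range(n):
            # Enhancement: DMUs already found dominated are skipped as
            # potential dominators, shrinking later comparisons.
            if k == j or dominated[k]:
                continue
            # a^k totally dominates a^j: >= in every coordinate, > in one.
            if np.all(A[k] >= A[j]) and np.any(A[k] > A[j]):
                dominated[j] = True
                break
    return dominated
```

Because domination is transitive, skipping already-dominated DMUs as candidate dominators cannot miss a classification: every dominated DMU is also dominated by some undominated one.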
The results of the implementation of HyperTran are illustrated in Figure 3. These graphs
were selected to represent what was typically observed. HyperTran also seems to respond more
predictably to cardinality and dimension than to density. Increases in cardinality generate more
work for the procedure without necessarily increasing the number of DMUs classified. This means
we can expect yields to be adversely affected by this attribute and this is confirmed in our tests.
The apparent adverse effect of increases in dimension on the yield can be explained by the increase
in the amount of work in the calculation of inner products. These experiments do not allow any
useful determination about the impact of density on HyperTran’s yield. The procedure may be
[Figure 3. Yield of HyperTran. Three panels: yield vs. cardinality (Dimension 15; Density 13%); yield vs. dimension (Cardinality 10000; Density 25%); yield vs. density (Cardinality 7500; Dimension 5).]
sensitive to the geometry and scaling of the data. Translating hyperplanes would tend to end up at
points with the more extreme magnitudes in dimensions where the attribute units are large.
Also, extreme points may have point clouds nearby that would tend to attract a disproportionate
number of hyperplanes to themselves.
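The translation principle HyperTran relies on — slide a hyperplane with a positive normal outward until it last touches the data; a unique maximizer of the inner product is then an extreme, hence efficient, point — costs one inner product per DMU. A minimal sketch under our own naming and data-layout assumptions (DMUs as rows of a NumPy array, attributes oriented so larger is better):

```python
import numpy as np

def hypertran_step(A, pi):
    """One translation step: slide the hyperplane with normal pi > 0
    outward until it last touches the data.  The unique maximizer of
    <pi, a^j> is an extreme (efficient) point of the VRS hull."""
    scores = A @ pi
    j_star = int(np.argmax(scores))
    # Uniqueness check: with a tie, the maximizer need not be extreme.
    unique = np.sum(np.isclose(scores, scores[j_star])) == 1
    return j_star, unique
```

Repeating this step with different normals classifies additional efficient DMUs, at the cost of one pass of inner products per hyperplane, which is consistent with the sensitivity to dimension and cardinality observed above.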
Figure 4 depicts three typical situations in our experience with HyperTwist. We can see that
increasing cardinality tends to decrease yield. The reason would be the same as with HyperTran;
that is, the number of inner products grows with the number of DMUs without necessarily a
proportional identification of efficient points. The procedure’s yield is also adversely affected by
dimension. Increasing the dimension increases the number of passes through the main loop of
the procedure and the inner products have more components. As density increases, the yield of
HyperTwist tends to increase slightly or remain more or less constant. This may sound
counterintuitive since one would expect HyperTwist to encounter more extreme points in a denser
environment, which should result in noticeable yield improvements. Higher density may result in
more classifications but this is counteracted by the increase in pivots from the additional extreme
points.
HyperTran and HyperTwist are the two most complex procedures studied here and have similar
functionality – the identification of efficient DMUs. Both rely on the same principle about
supporting hyperplanes: their support sets are composed of boundary points. Because
of this, it is appropriate to compare the two.
Hyperplane Translation and HyperTwist. A comparison between HyperTran and HyperTwist
is illustrated in Figure 5. As noted above, the yield of both procedures is adversely impacted by
cardinality and dimension. Higher densities have a slight positive effect on HyperTwist but its
[Figure 4. Yield of HyperTwist. Three panels: yield vs. cardinality (Dimension 10; Density 25%); yield vs. dimension (Cardinality 10000; Density 25%); yield vs. density (Cardinality 7500; Dimension 10).]
effect on HyperTran is not as clear. As Figure 5 shows, and as holds in general, HyperTwist is
the more efficient preprocessor, with yields frequently an order of magnitude or more greater than
those of HyperTran.
We finish this section reporting the performance of Dominator, HyperTran, and HyperTwist on
data from the Federal Financial Institutions Examination Council [13]. Using these data we built
three problems. The first one contains 4,971 DMUs, three inputs, and four outputs; the second has
12,456 DMUs, five inputs, and three outputs; and the third includes 19,939 DMUs, six inputs, and
five outputs. The yield of the procedures, along with that of the LP approach for contrast, is reported
in Table 2:
Table 2. Preprocessors (and LPs) Yield on bank data.
Problem    Card    Dim   Dominator   HyperTran   HyperTwist     LPs
      1    4971      7     612.984       0.379      149.790   8.075
      2   12456      8      91.034       0.014       21.968   0.560
      3   19939     11       0.848       0.070       15.272   0.027
These data display the negative effects on yields of increases in cardinality and dimension ob-
served in the synthetic data sets. HyperTwist continues to outperform HyperTran on these real-
world data.
It is important to remember that these preprocessors are also useful when dealing with polyhe-
dral sets different from the DEA VRS production possibility set.
[Figure 5. Comparison between HyperTran and HyperTwist. Three panels, each plotting the yields of both procedures: yield vs. cardinality (Dimension 20; Density 13%); yield vs. dimension (Cardinality 10000; Density 1%); yield vs. density (Cardinality 7500; Dimension 15).]
7. Concluding remarks. Preprocessors are an important aspect in the development of compu-
tational tools in many areas of OR/MS, especially when speed is critical in large scale applications.
Today, these types of approaches speed up linear programming, integer programming, and count-
less specialized procedures developed for optimization and combinatorics, making them better
able to cope with bigger and more complex problems.
The contributions of this paper are to collect, organize, analyze, implement, test, and compare
different preprocessors to classify DEA data points as efficient and inefficient. We introduce three
new preprocessors to DEA: one based on norm maximization; another on hyperplane translation,
HyperTran; and the third on hyperplane rotation, HyperTwist. We designed a total of five pre-
processing methods for the DEA variable returns to scale model. Computational results show
that the preprocessor to identify inefficient DMUs based on testing for domination, Dominator, is
highly effective. Two other preprocessors, HyperTwist and HyperTran, both based on the principle
that supporting hyperplanes identify efficient entities, produce excellent results with HyperTwist
consistently the better of the pair.
Testing compared yields, defined as the number of DMUs classified as efficient or inefficient
per CPU time unit (a tenth of a second). Testing shows that the yield of preprocessors usually
decreases when cardinality or dimension increases. The impact of changes in density, defined as
the percentage of points that are efficient, is not as clear, but it appears that yield tends to decrease
as density increases with Dominator, while density has little impact on HyperTwist. It is clear,
though, that the computational cost of identifying efficient or inefficient DMUs with preprocessors
is low. The effectiveness of preprocessors stems from the fact that they do not solve LPs and
conduct only simple computations such as sorting and calculating inner products and ratios.
Preprocessors should be a part of the DEA analyst’s toolbox especially when working with large
data sets.
References.
[1] Charnes, A., W.W. Cooper, and E. Rhodes, “Measuring the efficiency of decision making units,”
European Journal of Operational Research, Vol. 2, No. 6, 1978, pp. 429-444.
[2] Sueyoshi, T. and Y-L Chang, “Efficient algorithm for additive and multiplicative models in Data
Envelopment Analysis,” Operations Research Letters, Vol. 8, 1989, pp. 205-213.
[3] Sueyoshi, T., “A special algorithm for an additive model in Data Envelopment Analysis,” Journal
of the Operational Research Society, Vol. 3, 1990, pp. 249-257.
[4] Ali, A.I., “Streamlined computation for data envelopment analysis,” European Journal of Op-
erational Research, Vol. 64, 1993, pp. 61-67.
[5] Charnes, A., W.W. Cooper, B. Golany, L. Seiford, and J. Stutz, “Foundations of data en-
velopment analysis for Pareto-Koopmans efficient empirical production functions,” Journal of
Econometrics, Vol. 30, 1985, pp. 91–107.
[6] Briec, W., “Hölder Distance Function and Measurement of Technical Efficiency,” Journal of
Productivity Analysis, Vol. 11, 1998, pp. 111-131.
[7] Barr, R.S. and M.L. Durchholz, “Parallel and hierarchical decomposition approaches for solving
large-scale Data Envelopment Analysis models,” Annals of Operations Research, Vol. 73, 1997,
pp. 339–372.
[8] Dula, J.H., “A computational study of DEA with massive data sets,” Computers and Opera-
tions Research, in print.
[9] Dula, J.H. and F.J. Lopez, “Algorithms for the frame of a finitely generated unbounded poly-
hedron,” INFORMS Journal on Computing, Vol. 18, 2006, pp. 97–110.
[10] Dula, J.H., R.V. Helgason, B.L. Hickman, “Preprocessing schemes and a solution method for the
convex hull problem in multidimensional space,” Computer Science and Operations Research:
New Developments in Their Interfaces, O. Balci (ed.), pp. 59-70, Pergamon Press, U.K., 1992.
[11] Shaheen, M., Frame of a Pointed Finite Polyhedral Cone, Thesis for Master of Science, Depart-
ment of Economics, Mathematics, and Statistics at the University of Windsor, 2000, Windsor,
Ontario, Canada.
[12] Lopez, F.J. and J.H. Dula, “Adding and removing an attribute in a DEA model: theory and
processing,” under review.
[13] Federal Financial Institutions Examination Council (FFIEC), 2004 Report of Condition and
Income, http://www.chicagofed.org/economic_research_and_data/weekly_report_of_assets_and_liabilities.cfm.
[14] Rockafellar, R.T., Convex Analysis, Princeton University Press, 1970.
APPENDIX 1
Results and Proofs.
Result 1. Let a^ĵ = argmax{‖p − a^j‖_ℓ ; j = 1, . . . , n}, where ‖ · ‖_ℓ is the ℓ-norm of the argument.
If a^ĵ is unique then it is an extreme point of the convex hull of A.

Proof. Set ‖p − a^ĵ‖_ℓ = β and define B(p, β) = {z : ‖p − z‖_ℓ ≤ β}. B(p, β) is the ℓ-ball centered at p
with “radius” β. Two properties of B(p, β) are relevant: 1) it is convex (see [14], pp. 137-138); and
2) the elements of A are in its strict interior except for a^ĵ, which is on the boundary. Therefore,
the convex hull of A is contained in B(p, β) and there exists a supporting hyperplane for the ℓ-ball
at a^ĵ. This hyperplane also supports the convex hull but only at a^ĵ. This is enough to conclude
that a^ĵ is an extreme point of the convex hull.
Result 2. An inefficient DMU for the VRS production possibility set cannot maximize the 2-norm
when the focal point p is given by p_i = min{a^j_i ; j = 1, . . . , n} for all i.

Proof. Any inefficient DMU (including weakly efficient), a, can be expressed as a = ā + v where ā
is a convex combination of the extreme points of the VRS production possibility set and v ≠ 0 is a
direction in the recession cone defined by a positive combination of the directions −e_i; i = 1, . . . , m.
Note that p_i ≤ a_i ≤ ā_i; i = 1, . . . , m, and for some i′, a_{i′} < ā_{i′}. Therefore
‖a − p‖₂ < ‖ā − p‖₂.
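Results 1 and 2 together justify a simple norm-maximization preprocessor: place the focal point at the componentwise minimum of the data and report the unique farthest DMU as extreme-efficient. A minimal sketch with the 2-norm (the function name and the uniqueness tolerance are ours, not the paper's):

```python
import numpy as np

def norm_max_preprocessor(A):
    """Results 1 and 2 sketch: with focal point p_i = min_j a^j_i,
    a unique maximizer of ||a^j - p||_2 over the rows of A is an
    extreme point of the hull, hence an efficient DMU."""
    p = A.min(axis=0)                      # focal point, Result 2
    d = np.linalg.norm(A - p, axis=1)      # 2-norm distances to p
    j_star = int(np.argmax(d))
    # Result 1 requires the maximizer to be unique.
    unique = np.sum(np.isclose(d, d[j_star])) == 1
    return j_star, unique
```

By Result 2 the maximizer cannot be inefficient, so a unique maximizer is classified efficient without any LP.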
APPENDIX 2
HyperTwist:
A Preprocessor Based on Hyperplane Rotation.
Derivation.
Without loss of generality (see Note 1 below) and to simplify notation we develop this derivation
in terms of the m-th dimension. Consider an extreme-efficient DMU a^{j∗} ∈ ℝ^m obtained by
maximizing the translation of a hyperplane with normal (parameterized) vector π(γ) = [π; γ];
0 < π ∈ ℝ^{m−1}; 0 ≤ γ ∈ ℝ. A supporting hyperplane in ℝ^m, H(π(γ), β), for the VRS
production possibility set at a^{j∗} is such that

    ⟨π(0), a^{j∗}⟩ + γ a^{j∗}_m = β                                   (1)
    ⟨π(0), a^j⟩ + γ a^j_m ≤ β;  j = 1, . . . , n.                      (2)

Any rotation of this hyperplane with respect to the m-th axis at the point a^{j∗}, such that the
hyperplane remains a support, has the form

    ⟨π(0), a^{j∗}⟩ + γ∗ a^{j∗}_m = β∗                                 (3)
    ⟨π(0), a^j⟩ + γ∗ a^j_m ≤ β∗;  j = 1, . . . , n,                    (4)

where β∗ and γ∗ are controllable parameters (although not necessarily completely free). It follows
from (3) and (4) that

    ⟨π(0), a^j⟩ + γ∗ a^j_m ≤ ⟨π(0), a^{j∗}⟩ + γ∗ a^{j∗}_m;  j = 1, . . . , n.

Solving for γ∗, when (a^j_m − a^{j∗}_m) > 0, we obtain:

    γ∗ ≤ (⟨π(0), a^{j∗}⟩ − ⟨π(0), a^j⟩) / (a^j_m − a^{j∗}_m).          (5)

The maximum rotation occurs at a point a^ĵ ≠ a^{j∗} where γ∗ equals the right-hand side in (5), so
long as (5) holds for every point a^j ∈ A for which (a^j_m − a^{j∗}_m) > 0. If there does not exist such
a point, the maximum rotation occurs when the hyperplane supports the polyhedral set at a face
orthogonal to the m-th axis defined by one or more directions of recession. The second parameter,
β∗, is now uniquely specified by (3). The new hyperplane, H(π(γ∗), β∗), supports the production
possibility set at both a^{j∗} and a^ĵ.
Notes.
1. Our assumptions about the data mean that the recession cone of the VRS production possibility
set is always the negative orthant, independent of the input/output assignments of the attributes.
For this reason, working in a dimension that corresponds to an input or an output does not
make any difference for the purpose of our development and procedure HyperTwist.
2. If the denominator (a^j_m − a^{j∗}_m) > 0 in (5) then the numerator is strictly positive, assuring
γ∗ > 0. The numerator cannot be negative because the hyperplane H(π(γ), β) supports the
production possibility set at a^{j∗} by construction. It cannot be zero while the condition on the
denominator holds, since this would mean that a^j belongs to this hyperplane, which is impossible
because the point with the largest m-th component on this hyperplane is the one selected.
3. The point a^ĵ will be the one with the largest m-th component and will serve as the “hinge” for
the next twist of the supporting hyperplane. In case of ties, any one of them can be a hinge,
possibly leading to different paths. In procedure HyperTwist we require the identification of
one extreme point to proceed.
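One pivot of the rotation governed by (5) can be sketched as follows. This is our illustrative code, not the authors' implementation: DMUs are rows of a NumPy array, the m-th attribute is the last column, and pi0 holds the first m−1 components of the normal. The function returns the tightest bound on γ∗ and the hinge point attaining it, or None when no point satisfies the denominator condition.

```python
import numpy as np

def max_rotation(A, j_star, pi0):
    """Sketch of one HyperTwist pivot via inequality (5): rotate the
    supporting hyperplane about a^{j*} with respect to the m-th axis
    and return (gamma*, hinge index)."""
    a_star = A[j_star]
    best_gamma, hinge = np.inf, None
    for j in range(len(A)):
        denom = A[j][-1] - a_star[-1]          # a^j_m - a^{j*}_m
        if denom > 0:
            # Right-hand side of (5) for this candidate point.
            gamma = (pi0 @ a_star[:-1] - pi0 @ A[j][:-1]) / denom
            if gamma < best_gamma:
                best_gamma, hinge = gamma, j
    return best_gamma, hinge    # hinge is None if no such point exists
```

The hinge point returned here is the newly identified extreme point that anchors the next twist, as described in Note 3.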
APPENDIX 3
Yield of Preprocessors.
Dimension  Cardinality  Density  HyperTran  HyperTwist  Dominator
    5         2500        1 %       3.28      389.45     1808.37
    5         2500       13 %      37.33     1228.28     1447.92
    5         2500       25 %      59.91      449.37      905.95
    5         2500       50 %      69.13      649.09      416.07
    5         5000        1 %       4.00      199.72     1131.96
    5         5000       13 %       4.53      419.41      708.22
    5         5000       25 %      20.14      291.01      462.30
    5         5000       50 %      32.39      299.57      205.18
    5         7500        1 %       0.66      131.81      726.89
    5         7500       13 %      19.48      199.71      472.15
    5         7500       25 %       5.07      196.39      304.71
    5         7500       50 %       6.84      209.70      135.35
    5        10000        1 %       2.53      104.85      546.17
    5        10000       13 %      30.97      125.82      349.83
    5        10000       25 %      20.73      115.55      227.64
    5        10000       50 %      16.21      122.32      100.53
   10         2500        1 %       4.94      509.29      435.93
   10         2500       13 %      30.91      307.06      571.02
   10         2500       25 %      42.21      479.33      413.01
   10         2500       50 %      29.31      344.51      195.03
   10         5000        1 %       2.15      172.25      480.72
   10         5000       13 %      13.60      209.70      412.38
   10         5000       25 %      17.25      173.09      276.02
   10         5000       50 %      31.50      196.39      117.94
   10         7500        1 %       0.96       52.42      253.80
   10         7500       13 %       4.10       66.15      282.47
   10         7500       25 %       4.53       71.64      179.84
   10         7500       50 %      14.39       65.46       62.69
   10        10000        1 %       0.60       40.85      183.85
   10        10000       13 %       1.63       50.22      215.21
   10        10000       25 %       4.17       55.63      109.17
   10        10000       50 %      17.14       50.64       46.23
Yield of Preprocessors (Cont’d).
Dimension  Cardinality  Density  HyperTran  HyperTwist  Dominator
   15         2500        1 %       4.74      199.72      266.68
   15         2500       13 %      26.99      275.60      387.68
   15         2500       25 %      46.02      323.53      312.46
   15         2500       50 %      17.74      396.93      158.21
   15         5000        1 %       1.03       43.06      111.57
   15         5000       13 %       6.27       64.00      225.45
   15         5000       25 %       7.44       72.40      162.76
   15         5000       50 %       8.07       68.47       65.72
   15         7500        1 %       0.81       31.11       82.95
   15         7500       13 %       8.10       44.53      147.75
   15         7500       25 %       6.17       45.28       92.67
   15         7500       50 %      15.98       45.82       40.48
   15        10000        1 %       0.58       21.30       68.21
   15        10000       13 %       4.27       35.36      103.53
   15        10000       25 %       4.05       36.61       63.11
   15        10000       50 %       3.45       39.03       30.94
   20         2500        1 %       4.34      142.30       82.28
   20         2500       13 %      33.59      234.66      214.61
   20         2500       25 %      31.07      209.70      166.33
   20         2500       50 %      63.83      316.69       98.88
   20         5000        1 %       0.89       34.75       59.25
   20         5000       13 %      12.26       53.44       78.46
   20         5000       25 %       8.47       56.76       70.40
   20         5000       50 %      31.64       55.31       39.95
   20         7500        1 %       0.95       25.85       35.57
   20         7500       13 %       5.11       33.70       68.54
   20         7500       25 %       5.29       36.06       47.15
   20         7500       50 %       3.13       35.45       26.87
   20        10000        1 %       0.55       18.61       25.64
   20        10000       13 %       1.41       29.51       46.10
   20        10000       25 %       1.52       24.48       36.22
   20        10000       50 %       1.26       32.29       20.42