
Math. Program., Ser. A (2010) 123:345–370
DOI 10.1007/s10107-008-0254-5

FULL LENGTH PAPER

Multi-phase dynamic constraint aggregation for set partitioning type problems

Issmail Elhallaoui · Abdelmoutalib Metrane · François Soumis · Guy Desaulniers

Received: 14 December 2005 / Accepted: 7 October 2008 / Published online: 14 November 2008
© Springer-Verlag 2008

Abstract Dynamic constraint aggregation is an iterative method that was recently introduced to speed up the linear relaxation solution process of set partitioning type problems. This speedup is mostly due to the use, at each iteration, of an aggregated problem defined by aggregating disjoint subsets of constraints from the set partitioning model. This aggregation is updated when needed to ensure the exactness of the overall approach. In this paper, we propose a new version of this method, called the multi-phase dynamic constraint aggregation method, which essentially adds to the original method a partial pricing strategy that involves multiple phases. This strategy helps keep the size of the aggregated problem as small as possible, yielding a faster average computation time per iteration and fewer iterations. We also establish theoretical results that provide some insights explaining the success of the proposed method. Tests on the linear relaxation of simultaneous bus and driver scheduling problems involving up to 2,000 set partitioning constraints show that the partial pricing strategy speeds up the original method by an average factor of 4.5.

Keywords Dynamic constraint aggregation · Set partitioning · Degeneracy · Multi-phase algorithm · k-incompatibility

Mathematics Subject Classification (2000) 90C05

I. Elhallaoui · A. Metrane · F. Soumis · G. Desaulniers (B)
Département de Mathématiques et de Génie Industriel, École Polytechnique and GERAD, C.P. 6079, Succ. Centre-Ville, Montreal, QC H3C 3A7, Canada
e-mail: [email protected]

A. Metrane
e-mail: [email protected]

F. Soumis
e-mail: [email protected]

G. Desaulniers
e-mail: [email protected]


1 Introduction

Given a set $W$, a set $S$ of admissible non-empty subsets of $W$, and a cost $c_s$ for each subset $s \in S$, the set partitioning problem consists of selecting subsets in $S$ such that each element in $W$ belongs to exactly one of the selected subsets and the sum of the costs of these subsets is minimized. This problem is one of the fundamental models in operations research, in particular in vehicle and crew scheduling. In fact, several problems in this field consist of covering at minimum cost a set of tasks exactly once with feasible paths. A task may be a flight leg, a bus trip or a customer to visit, while a path may represent an aircraft rotation, a bus driver schedule or a vehicle route. Side constraints such as vehicle availability can also be considered, yielding set partitioning type problems. A partial list of applications for these models includes truck deliveries, vehicle routing, bus driver scheduling, airline crew scheduling, and simultaneous locomotive and car assignment (see [2,5,7,9,13,19]).

Set partitioning type problems can be modeled as integer linear programs (LP). For each subset $s \in S$, let $x_s$ be a binary variable indicating whether or not subset $s$ is part of the solution, and let $a_{ws}$ be a binary parameter that takes value 1 if subset $s$ contains $w \in W$ and 0 otherwise. Denote by $H$ the index set of the side constraints, and for each constraint $h \in H$, let $d_h$ be its right-hand side and $b_{hs}$ be the coefficient of $x_s$, $s \in S$, in this constraint. A generic formulation for set partitioning type problems is given by

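\begin{align}
\text{Minimize} \quad & \sum_{s \in S} c_s x_s \tag{1.1} \\
\text{subject to:} \quad & \sum_{s \in S} a_{ws} x_s = 1, \quad \forall w \in W \tag{1.2} \\
& \sum_{s \in S} b_{hs} x_s = d_h, \quad \forall h \in H \tag{1.3} \\
& x_s \in \{0, 1\}, \quad \forall s \in S. \tag{1.4}
\end{align}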

The objective function (1.1) seeks to minimize the total cost. The set partitioning constraints (1.2) indicate that each element $w \in W$ must belong to exactly one selected subset. Constraints (1.3) are the side constraints, which may also involve extra variables. Note that, to improve readability, we will hereafter use the word column to refer to a subset $s \in S$ and we will say that a column contains elements of $W$.
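To make constraints (1.2) concrete, the following minimal sketch enumerates all column selections of a tiny instance and keeps the cheapest one that covers every element exactly once. The instance data, the function names `is_partition` and `best_partition`, and the brute-force search are illustrative assumptions only; side constraints (1.3) are ignored, and realistic instances are solved by branch-and-bound rather than by enumeration.

```python
from itertools import combinations

def is_partition(columns, selected, W):
    """Check constraints (1.2): every element of W is covered exactly once."""
    covered = [w for s in selected for w in columns[s]]
    return sorted(covered) == sorted(W)

def best_partition(columns, costs, W):
    """Brute-force search over column selections (tiny instances only)."""
    best = None
    names = sorted(columns)
    for r in range(1, len(names) + 1):
        for selected in combinations(names, r):
            if is_partition(columns, selected, W):
                cost = sum(costs[s] for s in selected)
                if best is None or cost < best[0]:
                    best = (cost, selected)
    return best

# Hypothetical instance: four elements, five columns.
W = [1, 2, 3, 4]
columns = {"s1": [1, 2], "s2": [3, 4], "s3": [1, 2, 3], "s4": [4], "s5": [2, 3]}
costs = {"s1": 3, "s2": 4, "s3": 5, "s4": 1, "s5": 2}
print(best_partition(columns, costs, W))
```

In this toy instance, the selections {s1, s2} and {s3, s4} are the only feasible ones, and the second is cheaper.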

As is most often the case for real-life applications, we assume throughout the paper that model (1.1)–(1.4) is bounded. Moreover, we assume without loss of generality that there exists a partial ordering of the elements in $W$ which is sufficient to completely order the elements of any column $s \in S$. For instance, if the elements are tasks to accomplish at specific times such as flights, such an ordering may be given by the chronological order of the task starting times. If no such natural ordering exists, one can always choose an arbitrary order, such as the order of the set partitioning constraints in the model (when transmitted to the solver). Consequently, we consider that the elements in a column are ordered.


Model (1.1)–(1.4) can be solved by a branch-and-bound method which requires solving several linear relaxations (at least one per branch-and-bound node) during the solution process. These LPs can be solved by the simplex algorithm which becomes inefficient when the constraint matrix is not sparse enough. In fact, in numerical experiments, one can note that the basis typically contains a large percentage of variables that have value zero (hereafter called degenerate variables) and the algorithm executes many degenerate pivots, that is, the objective value does not improve with these pivots.

The number of constraints in an LP has a greater impact on the computing performance of the simplex algorithm than its number of variables. Curiously, to reduce solution times, several techniques such as partial pricing and column generation have been proposed in the literature for programs involving a large number of variables, but very few have been developed for programs with a large number of constraints. Recently, Elhallaoui et al. [10] addressed this latter issue in the case where the LPs correspond to the linear relaxations of set partitioning type models. They introduced an iterative dynamic constraint aggregation (DCA) method which relies on an aggregated problem. This aggregated problem is derived from the original problem by aggregating its set partitioning constraints into subsets of constraints (called clusters) and retaining a single representative constraint for each cluster. It involves only the columns that are compatible with the current aggregation. A column is said to be compatible if, for each cluster, it has either $a_{ws} = 1$ for all $w$ that have been aggregated in this cluster, or $a_{ws} = 0$ for all these $w$. These compatible variables can be pivoted into the basis of the aggregated problem without modifying the aggregation. The other variables are said to be incompatible and their incompatibility can be ranked according to the number of additional clusters that would need to be defined in order to make these variables compatible with the aggregation. To pivot an incompatible variable into the basis of the aggregated problem, the aggregation must be modified by increasing the number of clusters and, thus, the size of this problem. To ensure the exactness of the solution approach, the aggregation is updated during the solution process either by breaking up some of the clusters using a subset of incompatible variables or by combining together some of the clusters in such a way that the non-degenerate basic variables remain compatible with the resulting aggregation. Note that, in the first case, using highly incompatible variables yields larger size increases of the aggregated problem than using slightly incompatible variables.

The DCA method, combined with column generation, was tested by Elhallaoui et al. [10] on the linear relaxation of instances of the simultaneous mass transit vehicle and crew scheduling problem (VCSP). Even though they used very simple aggregation and disaggregation strategies, this first implementation succeeded in reducing the solution time by a factor ranging from 2 to 5 for instances involving up to 1,600 set partitioning constraints. The innovation behind this new methodology is to project the feasible domain into a lower dimensional space by removing degenerate basic variables. This projection allows the identical constraints in the reduced space to be aggregated, yielding an aggregated problem. The basis size is thus reduced, resulting in faster pivoting and less frequent degenerate simplex iterations. Note that the magnitude of this reduction depends on the number of degenerate basic variables: in general, a larger number of these variables yields a larger reduction.



In this paper, we exploit this observation to further accelerate the DCA method. To do so, we propose to use a partial pricing strategy that prices out, during most of the solution process, only the variables that are compatible or slightly incompatible with the current constraint aggregation. On the one hand, this strategy tends to maintain or increase the number of degenerate basic variables when compatible variables enter the basis, allowing the problem to be further aggregated. On the other hand, it avoids a fast increase in the aggregated problem size when breaking up clusters because this break-up is induced by only slightly incompatible variables. As the solution process progresses, more and more incompatible variables are priced out. Obviously, to ensure the exactness of the overall solution approach, complete pricing is used towards the end of the solution process. This evolving pricing strategy has given rise to an improved DCA method that we call the multi-phase DCA method (MPDCA).

The main contributions of this paper are as follows. First, we improve the DCA method by developing the MPDCA method. Second, to reach a larger readership, we show that the DCA method applies beyond the context of a column generation algorithm. Third, we discuss how this new method can be combined with column generation for solving vehicle routing and crew scheduling problems. Fourth, we carry out some theoretical analysis that provides insights into the proposed method and partially explains the good performance of the MPDCA method. In particular, we study the reduction factor of the expected number of bases obtained by using constraint aggregation and show that some conditions appearing with the new pricing strategy guarantee non-degenerate simplex pivots. Finally, to illustrate the efficiency of the MPDCA method, we provide computational results obtained for the linear relaxation of instances of the VCSP. These results show that, on average, the MPDCA method is 4.5 times faster than the DCA method for the largest tested instances (with 2,000 set partitioning constraints).

This paper is organized as follows. The next section briefly reviews the literature on degeneracy and on the methods for reducing the number of constraints. Section 3 presents the motivations behind this work. In Sect. 4, we introduce the MPDCA algorithm and present the DCA algorithm as a special case. We also discuss how to combine MPDCA with column generation for VCSP. In Sect. 5, we present our theoretical insights. In Sect. 6, we report computational results obtained on randomly generated instances of the VCSP. Conclusions are drawn in Sect. 7.

2 Literature review

In this section, we first review the literature that directly deals with degeneracy in linear programming. Then, we focus on methods that aim at speeding up the solution process of an LP by reducing the number of constraints to consider.

2.1 Degeneracy in linear programming

When the simplex method is used to solve an LP, the outbreak of degenerate basic solutions (that is, solutions with degenerate basic variables) during the solution process can yield iterations where the objective value does not change. Because of this degeneracy phenomenon, the algorithm can even cycle indefinitely. Cycling is very rare in practice. However, stalling due to a large number of consecutive degenerate simplex pivots is frequent for certain applications, especially for those modeled as set partitioning type problems.

Literature on degeneracy is abundant (for instance, see the volume edited by Gal [12]). To improve the performance of the simplex algorithm in the presence of degeneracy, several pivoting rules (see the survey by Terlaky and Sushong [26]) have been proposed, including the Devex rule [18], the steepest-edge rule [16], and Bland’s rule [3]. To avoid cycling, Charnes [4] introduced a perturbation method that modifies, once and for all, the right-hand side members of the constraints. This method is, in theory, equivalent to the lexicographic method later developed by Dantzig et al. [6]. Wolfe [28] proposed an ad hoc version of the perturbation method that perturbs a subset of the right-hand side members (those associated with the degenerate basic variables) only when degeneracy occurs. This procedure, which needs to be invoked recursively, has been tested by Ryan and Osborne [24] on an aircrew scheduling problem and proved to be very efficient. Other dynamic methods have also been proposed (for instance, see [1,11]).

All these methods to deal with degeneracy work on a full-size problem. As discussed next, other methods avoid degeneracy by reducing the number of constraints to consider.

2.2 Reduction in the number of constraints

Given a minimization LP, Lagrangian relaxation [14] transfers, using Lagrangian multipliers, a subset of the constraints into the objective function to define a subproblem. For any given admissible values of the multipliers, the optimal value of this subproblem is a lower bound on the optimal value of the LP. The problem of finding multiplier values yielding the largest lower bound is called the Lagrangian dual problem and is solved using a non-smooth optimization algorithm. Surrogate relaxation [15] is in the same spirit. The subproblem is however defined by replacing all the constraints of the LP by one or several surrogate constraints (i.e., non-negative linear combinations of the constraints) and the surrogate dual problem consists of finding optimal weights of the linear combinations. Both approaches have several similarities, including the fact that they can provide asymptotically the optimal value of the LP, but often without producing a feasible solution for the LP.
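As a brief illustration of the lower-bound property, consider a generic LP in which the relaxed constraints are equalities. The notation below ($c$, $A$, $b$, the set $X$ of points satisfying the remaining constraints, and the multiplier vector $\lambda$) is assumed here for illustration and is not taken from the paper.

```latex
\begin{align*}
z^* &= \min \{\, c^\top x : Ax = b,\; x \in X \,\}, \\
L(\lambda) &= \min_{x \in X} \; c^\top x + \lambda^\top (b - Ax)
  \;\le\; c^\top x^* + \lambda^\top (b - Ax^*) \;=\; z^*, \\
z_{LD} &= \max_{\lambda} \; L(\lambda) \;\le\; z^* .
\end{align*}
```

The inequality in the second line holds because an optimal solution $x^*$ of the LP belongs to $X$ and satisfies $Ax^* = b$, so the penalty term vanishes at $x^*$. Since $L$ is a pointwise minimum of functions that are affine in $\lambda$, it is concave and piecewise linear, which is why a non-smooth optimization algorithm is used to maximize it.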

As surveyed in Rogers et al. [23], various constraint aggregation methods were proposed in the literature before the 1990s. However, in most cases, the aggregation is static (i.e., it does not change throughout the whole solution process) and yields approximate solutions that may also be infeasible. In those cases, contributions often concern the analysis of the error due to the loss of information ensuing from the aggregation and how to define a good aggregation. On the other hand, Mendelssohn [21] and Shetty and Taylor [25] developed iterative (dynamic) constraint aggregation methods that can yield optimal solutions. The method of Mendelssohn [21] is specific to LPs arising in Markov decision processes. It solves at each iteration an aggregated problem and a series of linear programming subproblems that allow estimated values to be computed for all the dual variables of the original LP. These values are then used to define the aggregated problem of the next iteration. Shetty and Taylor [25] addressed the case of a general LP. Their method consists of aggregating the constraints once and disaggregating successively the violated ones whenever the solution of the aggregated problem violates some of the constraints. Lower and upper bounds on the optimal value are also computed and can be used to prematurely halt the solution process at an approximate solution. Test results on problems involving up to 200 constraints showed that substantial reductions in computational time can be realized when the process is stopped within 1% of optimality. However, the method was not particularly successful when optimality was sought.

As mentioned earlier, Elhallaoui et al. [10] recently pursued the theoretical work of Villeneuve [27] to develop an exact DCA method for set partitioning type problems. This iterative method reduces the number of set partitioning constraints by aggregating some of them and updates this aggregation when needed. Further details about this method are given in Sect. 4.

Before Elhallaoui et al., Pan [22] proposed a modified simplex method that has some similarities with the DCA method. This method uses a basis of reduced size obtained by setting aside a certain number of constraints that are redundant with the other constraints when only the non-degenerate variables are considered. It starts with a full-size feasible basis, reduces it, and then performs simplex-like pivots. The basis size increases with some of these pivots and remains unchanged with the others. Consequently, this method favors non-degenerate and fast pivots at the beginning of the solution process when the size of the basis is relatively small. This desired behavior is gradually lost as the size of the basis increases. Thus, the main differences between the algorithm of Pan and that of Elhallaoui et al. are that the size of the aggregated problems used in the latter algorithm can decrease during the solution process and that the algorithm of Elhallaoui et al. is specialized and only applicable to set partitioning type problems.

3 Motivations

It is well known that set partitioning problems are generally very degenerate problems, that is, the average number of bases per extreme point of the feasible domain is very large. In fact, degeneracy grows very rapidly with the problem size. Thus, resorting to aggregated problems (containing fewer constraints, variables, and non-zero elements per column) as in the DCA algorithm for solving large instances can certainly diminish the impact of degeneracy and reduce the number of iterations. Furthermore, pivoting in aggregated problems is much faster than pivoting in non-aggregated problems, yielding a faster average computation time per iteration. Consequently, it seems advantageous to use aggregated problems as in the DCA algorithm.

In the DCA algorithm, the size of the aggregated problems varies throughout the solution process as it depends on the current constraint aggregation. This aggregation is partially disaggregated when the reduced cost of an incompatible variable seems more advantageous than the reduced cost of any compatible variable. It is further aggregated only when the size of the aggregated problem becomes too large, that is, when degeneracy might start to considerably slow down the solution process. One weakness of the DCA algorithm is that the choice of the incompatible variables used for disaggregating the aggregation is made without considering their impact on the size of the resulting aggregated problem. Hence, when highly incompatible variables are chosen, the size of the aggregated problem increases rapidly. Another weakness of this algorithm is that it does not take advantage of a good initial aggregation (as computational results will show in Sect. 6.1) because it often prioritizes incompatible variables over compatible variables right at the start of the solution process and, in this way, moves away from the initial aggregation.

To improve the DCA algorithm, we thus propose to control the variables that can enter the basis and be used to disaggregate the constraint aggregation. In fact, the MPDCA algorithm relies on a partial pricing strategy that favors the compatible and slightly incompatible variables over the highly incompatible variables, resulting in a slower disaggregation process than without this strategy. Limiting the pricing to only the compatible variables at the beginning of the solution process also makes it possible to obtain very rapidly good integer solutions derived from the initial constraint aggregation. Moreover, the slow disaggregation process tends to produce linear relaxation solutions that are less fractional than those obtained with a faster disaggregation process. Finally, it also facilitates the re-optimization of the aggregated problems after disaggregation.

4 Multi-phase dynamic constraint aggregation

This section introduces the MPDCA algorithm for solving the linear relaxation of the set partitioning type model (1.1)–(1.4). Unlike Elhallaoui et al. [10], who described the DCA algorithm in a column generation context, we present the MPDCA method in a context without column generation and briefly discuss afterwards issues that arise when column generation is used.

4.1 Basic concepts

Constraint aggregation is performed according to a partition $Q$ of $W$ into ordered clusters. An ordered cluster is a subset of the elements of $W$ in which the elements are ordered. Denote by $L$ the set of the ordered clusters and by $W_l$ the ordered elements in cluster $l \in L$. $Q = \{W_l : l \in L\}$ is a partition of $W$ if $W_{l_1} \cap W_{l_2} = \emptyset$ for all $l_1, l_2 \in L$, $l_1 \neq l_2$, and $\bigcup_{l \in L} W_l = W$. Such a partition is called an aggregating partition. Now, let us define the notion of variable compatibility with respect to an aggregating partition.

Definition 4.1 Given an aggregating partition $Q$ into a set $L$ of ordered clusters, column $s \in S$ is said to be compatible with cluster $l \in L$ if it covers all elements in $W_l$ or none of them. If column $s$ is compatible with all clusters in $L$, then $s$ and its associated variable $x_s$ are said to be compatible with partition $Q$. The columns and variables that are not compatible with $Q$ are said to be incompatible.

The algorithm does not directly work with the linear relaxation of model (1.1)–(1.4). It rather uses a sequence of aggregated problems obtained by changing the aggregating partition. An aggregated problem ($AP_Q$) for a partition $Q$ of the elements into a set $L$ of ordered clusters is derived from the original model by discarding, for each cluster $l \in L$, all set partitioning constraints (1.2) defined for the elements in $W_l$ except one. All variables incompatible with $Q$ are removed from the model. Therefore, $AP_Q$ contains a subset of the set partitioning constraints (1.2) and, among the variables, only those that are compatible with $Q$.
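The construction of the aggregated problem can be sketched as follows. This is a minimal illustration under stated assumptions: columns are given as lists of covered elements, side constraints are ignored, and the first element of each cluster is arbitrarily taken as the representative constraint (the text only requires that one constraint per cluster be kept). The function name `aggregate` and the data are hypothetical.

```python
def aggregate(columns, costs, partition):
    """Build the data of an aggregated problem for a given partition.

    Returns the representative element of each cluster, the coefficient
    pattern (one 0/1 entry per cluster) of every compatible column, and
    the costs of these columns. Incompatible columns are dropped.
    """
    # Assumption: the first element of each cluster is kept as representative.
    representatives = [cluster[0] for cluster in partition]
    agg_columns = {}
    for name, col in columns.items():
        covered = set(col)
        pattern = []
        compatible = True
        for cluster in partition:
            hit = len(covered & set(cluster))
            if 0 < hit < len(cluster):   # cluster only partially covered
                compatible = False
                break
            pattern.append(1 if hit else 0)
        if compatible:
            agg_columns[name] = pattern
    agg_costs = {name: costs[name] for name in agg_columns}
    return representatives, agg_columns, agg_costs

# Hypothetical instance with two clusters of two elements each.
columns = {"s1": [1, 2], "s2": [3, 4], "s3": [1, 2, 3], "s4": [4], "s5": [2, 3]}
costs = {"s1": 3, "s2": 4, "s3": 5, "s4": 1, "s5": 2}
partition = [[1, 2], [3, 4]]
print(aggregate(columns, costs, partition))
```

Here the four original constraints shrink to two, and only columns s1 and s2 survive, since every other column covers some cluster partially.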

The MPDCA algorithm proceeds through a sequence of phases. Each phase corresponds to a different level of partial pricing to be used. Such a level is defined according to a number of incompatibilities that a column $s \in S$ can have with respect to the current aggregating partition $Q$ into a set $L$ of ordered clusters. This number can be mathematically defined as follows.

Definition 4.2 Given an aggregating partition $Q$ and a column $s$, the number of incompatibilities of $s$ with respect to $Q$ is given by

$$k_s^Q = \sum_{l \in L} \kappa_{ls},$$

where $\kappa_{ls}$ is equal to 1 if column $s$ covers some of the elements in $W_l$, but not all of them, and 0 otherwise. A column $s$ and its associated variable $x_s$ are said to be k-incompatible with respect to $Q$ if $k_s^Q = k$. Compatible columns and variables are also qualified as 0-incompatible.
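Definition 4.2 translates directly into a few lines of code. The sketch below is illustrative only: the function name `incompatibility_number` and the representation of columns and clusters as lists of elements are assumptions, not part of the paper.

```python
def incompatibility_number(column, partition):
    """Count the clusters that the column covers partially (Definition 4.2)."""
    covered = set(column)
    return sum(
        1
        for cluster in partition
        if 0 < len(covered & set(cluster)) < len(cluster)
    )

# Hypothetical partition of six elements into three clusters.
partition = [[1, 2], [3, 4], [5, 6]]
for column in ([1, 2], [2, 3], [1, 3, 5]):
    print(column, incompatibility_number(column, partition))
```

A returned value of 0 means that the column is compatible with the partition in the sense of Definition 4.1.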

A phase of the MPDCA algorithm is defined as follows.

Definition 4.3 The MPDCA algorithm is said to be in phase $k$ when, among the variables $x_s$, only those that are p-incompatible with $p \le k$ are priced out by the simplex algorithm. $k$ is called the phase number.
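Partial pricing in phase $k$ can be sketched as a filter applied before the reduced costs are computed. The sketch assumes that there are no side constraints, so that the reduced cost of a column is its cost minus the sum of the dual values of the elements it covers, and that a dual value is available for every element of $W$. The function name `price_phase`, the tolerance, and the data are hypothetical.

```python
def price_phase(columns, costs, duals, partition, k, tol=1e-9):
    """Return the negative reduced costs of the p-incompatible columns with p <= k."""
    candidates = {}
    for name, col in columns.items():
        covered = set(col)
        # Number of incompatibilities of the column (Definition 4.2).
        p = sum(
            1
            for cluster in partition
            if 0 < len(covered & set(cluster)) < len(cluster)
        )
        if p > k:
            continue  # this column is not priced out in phase k
        # Assumption: no side constraints, one dual value per element of W.
        reduced_cost = costs[name] - sum(duals[w] for w in col)
        if reduced_cost < -tol:
            candidates[name] = reduced_cost
    return candidates

# Hypothetical data: s1 is compatible, s3 is 1-incompatible, s5 is 2-incompatible.
partition = [[1, 2], [3, 4]]
columns = {"s1": [1, 2], "s3": [1, 2, 3], "s5": [2, 3]}
costs = {"s1": 3, "s3": 5, "s5": 2}
duals = {1: 2, 2: 2, 3: 2, 4: 1}
for k in range(3):
    print(k, price_phase(columns, costs, duals, partition, k))
```

Raising the phase number only enlarges the set of priced columns, so a column seen in phase $k$ is also seen in every later phase for the same partition.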

The sequence of phases that the algorithm goes through is predetermined and, to ensure the exactness of the algorithm, it must terminate with a phase where all variables are priced out. Because $\lfloor |W|/2 \rfloor$ is an upper bound on $k_s^Q$ for all aggregating partitions $Q$ and all columns $s \in S$, the last phase can always be chosen as phase $\lfloor |W|/2 \rfloor$. Depending on the application, other final phases can be chosen.
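The bound can be justified with a short counting argument based on Definition 4.2. The derivation below is a sketch added for illustration, with $L'_s$ an auxiliary symbol that does not appear in the paper. A cluster contributes to $k_s^Q$ only if column $s$ covers at least one of its elements and misses at least one, so every such cluster contains at least two elements.

```latex
\begin{align*}
L'_s &= \{\, l \in L : \kappa_{ls} = 1 \,\}, \qquad k_s^Q = |L'_s|, \\
2\, k_s^Q &\le \sum_{l \in L'_s} |W_l| \;\le\; \sum_{l \in L} |W_l| \;=\; |W|
  \quad \text{(the clusters are disjoint and cover } W\text{)}, \\
k_s^Q &\le \left\lfloor \frac{|W|}{2} \right\rfloor
  \quad \text{(since } k_s^Q \text{ is an integer)}.
\end{align*}
```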

4.2 Algorithm

Figure 1 presents a flowchart of the MPDCA algorithm. This algorithm starts with an initial aggregating partition $Q$ that is typically derived from a heuristic solution or logical reasoning. Such an initial aggregating partition is considered good if its clusters group together elements that have a high probability of being in a column of an optimal linear relaxation solution.

[Figure 1: flowchart. (1) Create an initial partition $Q$ and set phase $k = 0$. (2) Perform simplex pivots on $AP_Q$. (3) Compute disaggregated dual variables. (4) Price the p-incompatible variables with $p \le k$ and the $y$ variables. (5) Negative reduced cost? (6) $k = \lfloor |W|/2 \rfloor$? (7) $k = k + 1$. (8) Change the partition? (9) Update partition $Q$ and $AP_Q$ accordingly. Stop: optimal solution found. Loops are labeled as minor and major iterations.]

Fig. 1 The multi-phase dynamic constraint aggregation method

Then, the MPDCA algorithm is made up of two types of iterations: minor and major ones. A minor iteration (steps 2–8) starts by partially solving $AP_Q$ (step 2), that is, a series of simplex pivots is performed on $AP_Q$. The number of pivots to execute in each minor iteration can depend on various stopping criteria. For instance, one can stop when a given maximum number of pivots is reached, when the decrease in the objective value over a given number of the most recent simplex pivots is considered insufficient, or when the number of degenerate variables exceeds a given threshold. These simplex pivots yield a primal and a dual solution for $AP_Q$. Because this dual solution does not include dual values for the set partitioning constraints discarded from $AP_Q$, a dual variable disaggregation procedure (step 3), which computes these dual values from the aggregated dual solution, is then invoked before pricing out the variables.

Dual variable disaggregation can be done in various ways. As suggested by Elhallaoui et al. [10], it should produce disaggregated dual values such that the reduced costs of the compatible variables remain unchanged and those of a large subset $I$ of incompatible variables (which can be chosen arbitrarily) are non-negative. These conditions define a system of linear equalities (one for each cluster) and inequalities (one for each variable in $I$) that can be infeasible. To identify a feasible system and compute a solution for it, Elhallaoui et al. proposed a procedure that relies on the assumption that the elements of $W$ are ordered in the columns and the clusters. This assumption makes it possible to classify the incompatible variables and restrict the incompatible variables in $I$ to certain classes. After performing a variable substitution, the authors showed that the linear system corresponds to the optimality conditions of a shortest path problem. Starting from a selected subset $I$ of incompatible variables, their iterative procedure consists of reducing heuristically this subset until the corresponding system becomes feasible, and finding a feasible solution for the resulting system. At each iteration, a feasible shortest path problem is solved. If it is unbounded, an arc in a negative-cost cycle is removed before starting a new iteration. Otherwise, its dual optimal solution is a feasible solution to the resulting linear system. It should be noted that the ordering assumption is only needed for identifying which incompatible variables can be included in $I$ to yield the optimality conditions of a shortest path problem. Because $I$ can be chosen arbitrarily, the ordering of the elements in $W$ can also be arbitrary. The procedure of Elhallaoui et al. was used for our tests.
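One way to write the linear system mentioned above is sketched below. The symbols $\hat{\alpha}_l$ (dual value of the representative constraint of cluster $l$ in the aggregated problem) and $\alpha_w$ (disaggregated dual value of element $w$) are notation assumed here for illustration, and the dual terms of the side constraints are omitted because they are the same in both problems.

```latex
\begin{align*}
\sum_{w \in W_l} \alpha_w &= \hat{\alpha}_l, && \forall l \in L, \\
c_s - \sum_{w \in W} a_{ws}\, \alpha_w &\ge 0, && \forall s \in I .
\end{align*}
```

The equalities are sufficient to preserve the reduced costs of the compatible variables: a compatible column $s$ covers each cluster entirely or not at all, so $\sum_{w \in W} a_{ws} \alpha_w$ is the sum of $\sum_{w \in W_l} \alpha_w = \hat{\alpha}_l$ over the clusters that $s$ covers, which is exactly the dual contribution of $s$ in the aggregated problem.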

Once a disaggregated dual solution is computed, all p-incompatible variables with $p \le k$ are priced out (step 4), where $k$ is the current phase number. If none of these variables has a negative reduced cost (step 5), a test is performed in step 6 to verify if complete pricing was used, that is, if the current phase is the last phase. If not, the phase number is increased in step 7 and pricing is performed again in step 4 on a larger set of variables. Otherwise, the algorithm stops since the current primal solution for $AP_Q$ is an optimal solution of the original problem. When negative reduced cost variables exist among the priced variables, a simple test (step 8) decides whether or not partition $Q$ should be changed. This test is deemed positive in two cases: (i) the ratio of the smallest compatible variable reduced cost over the smallest incompatible variable reduced cost is less than a prespecified value (disaggregation is needed) or (ii) the ratio $|Q|/|W|$ exceeds a given threshold value and the objective function value has decreased since the last partition update (aggregation is needed). The first case compares the potential (evaluated by the reduced cost) of improving the objective value by pivoting into the basis an incompatible variable with that of a compatible variable. In practice, the prespecified value is less than one to further favor pivots on compatible variables. The second case simply prevents the aggregated problem from becoming too large and, thus, more sensitive to high degeneracy. The strict decrease condition is needed to avoid cycling. When the test in step 8 is negative, another minor iteration is started. When it is positive, partition $Q$ is changed in step 9 to conclude a major iteration.
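The test of step 8 can be sketched as a small function returning one flag per case. The function name `should_change_partition`, its parameters, and the default threshold values 0.5 and 0.8 are hypothetical and chosen only for illustration; the text states only that the first threshold is less than one in practice.

```python
def should_change_partition(min_rc_compatible, min_rc_incompatible,
                            n_clusters, n_elements, objective_decreased,
                            ratio_threshold=0.5, size_threshold=0.8):
    """Return the pair (disaggregate, aggregate) for the step-8 test.

    min_rc_compatible / min_rc_incompatible: smallest reduced costs among
    the priced compatible and incompatible variables.
    """
    disaggregate = False
    if min_rc_incompatible < 0:
        # Case (i): the best incompatible variable looks much more
        # promising than the best compatible one.
        ratio = min_rc_compatible / min_rc_incompatible
        disaggregate = ratio < ratio_threshold
    # Case (ii): the aggregated problem has grown too large and the
    # objective value has strictly decreased since the last update.
    aggregate = (n_clusters / n_elements > size_threshold) and objective_decreased
    return disaggregate, aggregate

print(should_change_partition(-1.0, -10.0, 10, 100, False))  # ratio 0.1
print(should_change_partition(-5.0, -6.0, 90, 100, True))    # ratio about 0.83
```

With a threshold below one, an incompatible variable triggers disaggregation only when its reduced cost is clearly more negative than that of every compatible variable, which matches the stated preference for pivots on compatible variables.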

A major iteration (steps 2 to 9) thus consists of a series of minor iterations and an update of the aggregating partition $Q$. In such an update, the partition is aggregated and/or disaggregated according to the conditions that are fulfilled in the test of step 8 to yield a new partition $Q$. When only the first condition is met, the new partition is obtained by breaking up some of the clusters of the current partition to make compatible a set $I$ of incompatible columns (those for which the corresponding variables have the most negative reduced costs). When only the second condition is met, the new partition is the one with the minimum number of clusters such that the non-degenerate basic variables are compatible. Finally, when both conditions are met, the new partition ensures that the non-degenerate basic variables and the columns in the above set $I$ are compatible.
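The two kinds of partition update can be sketched as follows, under simplifying assumptions. The sketch ignores the ordering of the elements within clusters, so it may differ from the update rules actually used in the paper. `split_for_column` breaks every partially covered cluster into its covered and uncovered parts, which is one simple way to make a column compatible. `coarsest_partition` groups the elements that belong to exactly the same given columns, which yields the fewest clusters for which all these columns are compatible when ordering is disregarded. Both function names are hypothetical.

```python
def split_for_column(partition, column):
    """Disaggregate: split clusters so that the column becomes compatible."""
    covered = set(column)
    new_partition = []
    for cluster in partition:
        inside = [w for w in cluster if w in covered]
        outside = [w for w in cluster if w not in covered]
        # A fully covered or fully uncovered cluster is kept unchanged.
        new_partition.extend(part for part in (inside, outside) if part)
    return new_partition

def coarsest_partition(W, columns):
    """Aggregate: fewest clusters such that all given columns are compatible."""
    groups = {}
    for w in W:
        # Elements contained in exactly the same columns can share a cluster.
        signature = frozenset(name for name, col in columns.items() if w in col)
        groups.setdefault(signature, []).append(w)
    return list(groups.values())

print(split_for_column([[1, 2], [3, 4]], [2, 3]))
print(coarsest_partition([1, 2, 3, 4], {"a": [1, 2, 3], "b": [4]}))
```

In the aggregation case, the columns passed to `coarsest_partition` would be those of the non-degenerate basic variables, possibly together with the columns of set $I$ when both conditions of step 8 are met.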

The DCA algorithm can be obtained from the MPDCA algorithm by setting $k$ to $\lfloor |W|/2 \rfloor$ in step 1. In this case, all variables are always priced out in step 4 and the algorithm stops the first time it reaches step 6. Consequently, the MPDCA algorithm improves on the DCA algorithm by executing phases with small $k$ values. With this partial pricing strategy, the solution process focuses most of the time on variables that are compatible or slightly incompatible with the current aggregating partition. Note that the set of variables to price out does not only change when the phase number changes, but also every time that partition $Q$ is updated. Note also that the phase number could be updated differently, including updating procedures that do not necessarily increase the phase number monotonically. For instance, certain phases can be skipped when they do not often yield additional negative reduced cost variables after going through the first phases. The procedure that we used is discussed at the end of the following subsection.

Restricting the set of variables to price out during the early phases (those with small $k$ values) of the MPDCA method has three major impacts on the solution process. First, since the variables used to disaggregate the aggregating partition in step 8 contain a small number of incompatibilities, the partition is slowly disaggregated. This allows a smooth transition from one major iteration to the next. Second, when only compatible or slightly incompatible variables can enter the basis in the simplex algorithm, there is a high chance of introducing additional degenerate variables in the aggregated problem (which has a smaller size than the original problem). This gives the opportunity to further aggregate the partition in step 8. In this case, the threshold value that is compared to the ratio $|Q|/|W|$ to decide if the partition must be aggregated can be reduced to avoid large amplitudes in the partition size. Finally, when the initial partition is built from a good feasible solution (that can be an integer solution if the integrality gap is known to be small), the MPDCA algorithm usually retrieves this solution very rapidly in phase $k = 0$, positioning itself relatively close to an optimal solution. This speeds up the solution process.

4.3 Specialization for column generation

When the MPDCA algorithm is used in conjunction with column generation, the following points are worth mentioning. In step 2, the problem that is aggregated is the restricted master problem. Therefore, it seems natural, as proposed in Elhallaoui et al. [10], to completely solve it. In this case, a minor iteration simply corresponds to a column generation iteration. The most important point is certainly that, in a column generation context, not all variables are known a priori. They must be generated from the subproblems in step 4. Depending on the algorithm used to solve these subproblems, both compatible and incompatible variables may be generated at the same time. In this case, the generation of incompatible variables may impede the generation of compatible variables, possibly yielding a faster disaggregation of the aggregating partition. Also, a limited number of generated incompatible variables may restrict the efficiency of the dual variable disaggregation procedure (step 3) and the accuracy of the test used to decide when the partition should be updated (step 8).

Furthermore, an additional restriction must be imposed in the column generation subproblem to ensure that only the variables having at most $k$ incompatibilities are priced out in phase $k$. Although this restriction modifies the subproblem definition, it may often be possible to deal with it exactly or heuristically. For instance, in several vehicle routing and crew scheduling problems (such as the one used for the tests in Sect. 6), the subproblem is a shortest path problem with resource constraints solvable by dynamic programming (see [8,20]) in pseudo-polynomial time, where a feasible path defines a column of $S$ and elements of $W$ represent tasks to accomplish once. A resource is a quantity that varies along each arc of a path according to predefined arc extension functions. At each node of a path, this quantity is constrained to take a value in a predefined resource interval. Typical resources are the time, the total vehicle load, and the number of hours flown by a pilot in a workday (see [7,13]). In such applications, the partial pricing strategy of the MPDCA algorithm can be controlled in the subproblem by adding a constrained resource which cumulates an approximation $\bar{k}_s^Q$ of the number of incompatibilities along a path $s$. In phase $k$ of the algorithm, this resource is restricted at each node of the underlying network to take a value in $[0, k]$. The addition of this resource generally speeds up the subproblem solution time in the early phases ($k = 0, 1, 2$) of the MPDCA algorithm because, in those phases, the resource intervals $[0, k]$ substantially reduce the subproblem feasible domain.

We propose to use an approximation $\bar{k}_s^Q$ for the number of incompatibilities along a path $s$ because the exact number $k_s^Q$ cannot be computed as the sum of values associated with the arcs forming this path. Computing it exactly would thus require that a state (which represents a partial path) contain all tasks encountered so far and, in this case, the worst-case complexity of the dynamic programming algorithm would become exponential. The proposed approximation requires that a state retain only the last task encountered along a partial path, thus considerably reducing the state space and preserving the pseudo-polynomiality of the dynamic programming algorithm. This approximation considers that a column is incompatible with an ordered cluster as soon as this column does not cover consecutively the elements of this cluster in the appropriate order. It is computed as follows.

Let $\rho : W \to W \cup \{\mathrm{NIL}\}$ (resp. $\sigma : W \to W \cup \{\mathrm{NIL}\}$) be a predecessor (resp. successor) function that associates with a task $w \in W$ its predecessor (resp. successor) task in the ordered clusters, or the NIL value if $w$ has no predecessor (resp. successor). Denote by $t_s$ the number of tasks covered by path $s$ and by $\nu_s^1, \nu_s^2, \ldots, \nu_s^{t_s}$ the sequence of these tasks in path $s$. Then, the approximate number $\bar{k}_s^Q$ of incompatibilities of path $s$ with respect to an aggregating partition $Q$ is computed by the following algorithm:

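    $\bar{k}_s^Q = 0$
    for $i = 1, 2, \ldots, t_s$ do
        if ($\rho(\nu_s^i) \neq \mathrm{NIL}$ and ($i = 1$ or $\rho(\nu_s^i) \neq \nu_s^{i-1}$)) then $\bar{k}_s^Q = \bar{k}_s^Q + 1$
        if ($\sigma(\nu_s^i) \neq \mathrm{NIL}$ and ($i = t_s$ or $\sigma(\nu_s^i) \neq \nu_s^{i+1}$)) then $\bar{k}_s^Q = \bar{k}_s^Q + 1$.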
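The counting procedure above can be sketched in Python as follows. This is only an illustrative sketch: the function names (`build_neighbors`, `approx_incompatibilities`) and the representation of ordered clusters as lists of task labels are assumptions made here, with `None` standing in for NIL. The sample data are the two clusters and the six columns of Table 1.

```python
def build_neighbors(clusters):
    """Return the predecessor map rho and successor map sigma (None plays the role of NIL)."""
    rho, sigma = {}, {}
    for cluster in clusters:
        for pos, task in enumerate(cluster):
            rho[task] = cluster[pos - 1] if pos > 0 else None
            sigma[task] = cluster[pos + 1] if pos + 1 < len(cluster) else None
    return rho, sigma


def approx_incompatibilities(path, clusters):
    """Approximate number of incompatibilities of a path (sequence of tasks)."""
    rho, sigma = build_neighbors(clusters)
    count = 0
    last = len(path) - 1
    for i, task in enumerate(path):
        # The cluster predecessor exists but is not the task visited just before.
        if rho[task] is not None and (i == 0 or rho[task] != path[i - 1]):
            count += 1
        # The cluster successor exists but is not the task visited just after.
        if sigma[task] is not None and (i == last or sigma[task] != path[i + 1]):
            count += 1
    return count


# Data of Table 1: two ordered clusters and six columns.
CLUSTERS = [["1-1", "1-2", "1-3", "1-4"], ["2-1", "2-2", "2-3"]]
COLUMNS = {
    "s1": ["1-1", "1-2", "1-3", "1-4", "2-1"],
    "s2": ["1-2", "1-3", "2-1", "2-2", "2-3"],
    "s3": ["1-1", "1-2", "2-3"],
    "s4": ["1-2", "1-3", "2-3"],
    "s5": ["1-2", "1-4"],
    "s6": ["1-1", "1-2", "2-1", "2-2", "1-3", "2-3", "1-4"],
}
APPROX = {name: approx_incompatibilities(path, CLUSTERS) for name, path in COLUMNS.items()}
```

Under these assumptions, `APPROX` matches the approximate counts listed in Table 1 (1, 2, 2, 3, 3 and 6 for columns s1 to s6).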

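Denoting by $L_1 \subseteq L$ the set of singletons in $L$, it is easy to prove that

$$\bar{k}_s^Q = \sum_{l \in L \setminus L_1} \sum_{i=1}^{|W_l|-1} \kappa_{ls}^i,$$

where

$$\kappa_{ls}^i = \begin{cases} 1 & \text{if } s \text{ contains } w_l^i \text{ or } w_l^{i+1} \text{ but not both,} \\ 2 & \text{if } s \text{ contains both } w_l^i \text{ and } w_l^{i+1} \text{ but not consecutively or not in this order,} \\ 0 & \text{otherwise.} \end{cases}$$

Furthermore, because $\kappa_{ls} \le \sum_{i=1}^{|W_l|-1} \kappa_{ls}^i$ for all $l \in L \setminus L_1$ and $\kappa_{ls} = 0$ for all $l \in L_1$, the inequality $k_s^Q \le \bar{k}_s^Q$ holds for all partitions $Q$ and paths $s \in S$.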
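The pairwise expression for the approximate count can be checked on the data of Table 1 with a short sketch. The function name `kappa_sum` and the list-based cluster representation are assumptions made for illustration, and the exact counts in `EXACT` are simply copied from Table 1 rather than recomputed.

```python
def kappa_sum(path, clusters):
    """Sum of the pairwise terms over consecutive task pairs of every cluster.

    Assumes that each task appears at most once in the path.
    """
    position = {task: idx for idx, task in enumerate(path)}
    total = 0
    for cluster in clusters:  # a singleton cluster has no consecutive pair
        for first, second in zip(cluster, cluster[1:]):
            has_first, has_second = first in position, second in position
            if has_first != has_second:
                total += 1  # exactly one task of the pair is covered
            elif has_first and position[second] != position[first] + 1:
                total += 2  # both covered, but not consecutively in this order
    return total


CLUSTERS = [["1-1", "1-2", "1-3", "1-4"], ["2-1", "2-2", "2-3"]]
COLUMNS = {
    "s1": ["1-1", "1-2", "1-3", "1-4", "2-1"],
    "s2": ["1-2", "1-3", "2-1", "2-2", "2-3"],
    "s3": ["1-1", "1-2", "2-3"],
    "s4": ["1-2", "1-3", "2-3"],
    "s5": ["1-2", "1-4"],
    "s6": ["1-1", "1-2", "2-1", "2-2", "1-3", "2-3", "1-4"],
}
# Exact numbers of incompatibilities, copied from Table 1.
EXACT = {"s1": 1, "s2": 1, "s3": 2, "s4": 2, "s5": 1, "s6": 0}
KAPPA_TOTALS = {name: kappa_sum(path, CLUSTERS) for name, path in COLUMNS.items()}
```

On these six columns the pairwise sums coincide with the approximate counts of Table 1, and each one is at least as large as the corresponding exact count.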



Table 1 Examples of incompatible columns and their numbers of incompatibilities

Column $s$   Ordered elements in $s$               $k_s^Q$   $\bar{k}_s^Q$
$s_1$        1-1, 1-2, 1-3, 1-4, 2-1               1         1
$s_2$        1-2, 1-3, 2-1, 2-2, 2-3               1         2
$s_3$        1-1, 1-2, 2-3                         2         2
$s_4$        1-2, 1-3, 2-3                         2         3
$s_5$        1-2, 1-4                              1         3
$s_6$        1-1, 1-2, 2-1, 2-2, 1-3, 2-3, 1-4     0         6

Table 1 gives the exact and approximate numbers of incompatibilities ($k_s^Q$ and $\bar{k}_s^Q$) for six examples of columns (denoted $s_1$ to $s_6$) for a problem involving seven elements that are partitioned into two clusters comprising, in order, the elements 1-1, 1-2, 1-3, and 1-4 for the first cluster, and 2-1, 2-2, and 2-3 for the second one. From these examples, we see that the approximations can be very good in some cases (columns $s_1$ to $s_4$), but also very bad in others (column $s_6$). However, when the initial partition is very good, the columns yielding bad approximations should not be part of an optimal solution and, therefore, overestimating their number of incompatibilities is not harmful to the solution process. Furthermore, the use of a partial pricing different from the one proposed without column generation does not hinder the exactness of the overall algorithm.

Note that the approximate number of incompatibilities $\bar{k}_s^Q$ can take a value larger than $\lfloor |W|/2 \rfloor$, but not larger than $2|W| - 2$. Hence, when column generation is used, the equality in step 6 of the algorithm needs to be replaced by $k = 2|W| - 2$.

Note also that, for our computational experiments, we did not go through all phase numbers from 0 to $2|W| - 2$. Instead, we used the following sequence of phase numbers: 0, 1, 2, and $2|W| - 2$. Phases $3, 4, \ldots, 2|W| - 3$ were skipped simply because they do not often yield additional negative reduced cost variables after going through the first three phases.
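The way the approximate count can be carried as a resource, using only the last task of a partial path, may be sketched as follows. This is a minimal illustration under stated assumptions, not the GENCOL implementation: the names `extend`, `feasible_in_phase` and `phase_sequence` are hypothetical, a path is given as a full task sequence instead of being built by dynamic programming, and the contribution of the last task is added when the path is closed.

```python
def build_neighbors(clusters):
    """Predecessor (rho) and successor (sigma) maps; None stands for NIL."""
    rho, sigma = {}, {}
    for cluster in clusters:
        for pos, task in enumerate(cluster):
            rho[task] = cluster[pos - 1] if pos > 0 else None
            sigma[task] = cluster[pos + 1] if pos + 1 < len(cluster) else None
    return rho, sigma


def extend(count, last_task, next_task, rho, sigma):
    """Resource extension along one arc; only the last task of the label is needed."""
    if last_task is None:  # next_task is the first task of the path
        return count + int(rho[next_task] is not None)
    penalty = 0
    if sigma[last_task] is not None and sigma[last_task] != next_task:
        penalty += 1
    if rho[next_task] is not None and rho[next_task] != last_task:
        penalty += 1
    return count + penalty


def feasible_in_phase(path, clusters, k):
    """True if the resource stays within the window [0, k] along the whole path."""
    rho, sigma = build_neighbors(clusters)
    count, last = 0, None
    for task in path:
        count = extend(count, last, task, rho, sigma)
        if count > k:  # the resource never decreases, so the label can be discarded
            return False
        last = task
    if last is not None and sigma[last] is not None:
        count += 1  # the last task is not followed by its cluster successor
    return count <= k


def phase_sequence(num_tasks):
    """Phase numbers used in the experiments: 0, 1, 2 and 2|W| - 2."""
    return [0, 1, 2, 2 * num_tasks - 2]


CLUSTERS = [["1-1", "1-2", "1-3", "1-4"], ["2-1", "2-2", "2-3"]]
S1 = ["1-1", "1-2", "1-3", "1-4", "2-1"]
S6 = ["1-1", "1-2", "2-1", "2-2", "1-3", "2-3", "1-4"]
```

With the clusters of Table 1, column s1 (approximate count 1) is rejected in phase 0 and accepted from phase 1 on, while column s6 (approximate count 6) is only accepted once the window reaches 6.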

5 Theoretical insights

In this section, we establish theoretical observations that provide insights explaining the efficiency of the MPDCA method. First, we discuss the convergence of the MPDCA method. Second, we estimate the impact of constraint aggregation on the expected number of feasible bases arising in a set partitioning problem. Finally, we show that some conditions appearing with the new pricing strategy guarantee non-degenerate simplex pivots.

5.1 Convergence

The following observation follows directly from the facts that the partial pricing strategy of the MPDCA algorithm ends up in full pricing and that, if need be, the aggregated problem can be fully disaggregated.



Observation 5.1 Assuming that the primal simplex algorithm used to solve the aggregated problems includes an anti-cycling strategy that takes care of degeneracy, the MPDCA algorithm requires a finite number of simplex pivots to find an optimal solution to the set partitioning type model (1.1)–(1.4).

5.2 Reduction of the expected number of bases

In this section, we are interested in the relation between the number of bases of a fully disaggregated set partitioning problem and the number of bases of an aggregated problem. More specifically, we obtain bounds on the expected number of feasible bases of the disaggregated problem that can be generated from a feasible basis of an aggregated problem. To obtain this result, which is stated below in Theorem 5.3, we consider a random set partitioning problem whose coefficient matrix is an $m \times n$ 0-1 matrix of density $\lambda \in (0, 1)$ denoted $A = (a_{ij})$. The coefficients $a_{ij}$ correspond to independent random Bernoulli variables that have the following probability distribution: $a_{ij} = 1$ with probability $\lambda$ and $a_{ij} = 0$ with probability $1 - \lambda$. In the following, a column of $A$ is said to be fixed (resp. random) if the values of the random coefficients it contains are known (resp. unknown).

The following lemma is required to prove Theorem 5.3.

Lemma 5.2 Let $A_1, A_2, \ldots, A_{k-1}$ be $k - 1$ fixed columns of $A$ that are linearly independent, and $A_k$ a random column of $A$. Then, the probability that $A_1, A_2, \ldots, A_k$ are linearly independent is at least equal to $1 - \bar{\lambda}^{m-k+1}$, where $\bar{\lambda} = \max\{\lambda, 1 - \lambda\}$.

Proof Without loss of generality, assume that $A_1, A_2, \ldots, A_k$ are the first $k$ columns of $A$ and, since $A_1, A_2, \ldots, A_{k-1}$ are linearly independent, assume also that the submatrix $D = (a_{ij})_{1 \le i,j \le k-1}$ is non-singular. Denote by $\Lambda = (A_j)_{1 \le j \le k}$ the matrix composed of these $k$ columns and by $D_l^{k-1}$ (written $D_l$ hereafter), $l \in \{k, \ldots, m\}$, the submatrix of $\Lambda$ composed of its first $k - 1$ rows and its row $l$.

Let us compute an upper bound on the probability that $A_k$ is linearly dependent on the vectors $A_1, A_2, \ldots, A_{k-1}$. Since $D$ is non-singular, $A_k$ is linearly dependent on $A_1, A_2, \ldots, A_{k-1}$ if and only if each row $l \in \{k, \ldots, m\}$ of $\Lambda$ linearly depends on its first $k - 1$ rows. Expanding the determinant of $D_l$ along its last column, the latter condition is equivalent to

$$\det(D_l) = a_{l,k} \det(D) + c_l = 0, \quad \forall l \in \{k, \ldots, m\},$$

where $\det(\cdot)$ is the determinant function and $c_l$, $l = k, \ldots, m$, is a random value that depends on the values of $a_{ik}$, $i = 1, 2, \ldots, k - 1$, but not on that of $a_{l,k}$.

Because $\det(D) \neq 0$, the equation

$$z \det(D) + c_l = 0$$

has exactly one solution $z_l^*$ for each $l \in \{k, \ldots, m\}$. Furthermore, since the probability that $a_{l,k} = z_l^*$ is equal to $\lambda$ if $z_l^* = 1$, to $1 - \lambda$ if $z_l^* = 0$, and to 0 otherwise, the probability that $\det(D_l) = 0$ cannot exceed $\max\{\lambda, 1 - \lambda\} = \bar{\lambda}$. Hence, because the coefficients $a_{l,k}$, $l = k, \ldots, m$, are independent, the probability that $\det(D_l) = 0$ for all $l \in \{k, \ldots, m\}$, or equivalently that $A_k$ linearly depends on $A_1, A_2, \ldots, A_{k-1}$, is at most equal to $\bar{\lambda}^{m-k+1}$. Consequently, the probability that the vectors $A_1, A_2, \ldots, A_k$ are linearly independent is at least $1 - \bar{\lambda}^{m-k+1}$. □
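The bound of Lemma 5.2 can be compared with an exact probability on a tiny instance by enumerating every 0-1 realization of the random column. The sketch below is illustrative only: the sizes ($m = 4$, $k = 3$), the density $\lambda = 3/10$, the two fixed columns and the helper names are all assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product


def rank(columns):
    """Rank of the matrix with the given columns, by exact Gaussian elimination."""
    rows = [[Fraction(col[i]) for col in columns] for i in range(len(columns[0]))]
    r = 0
    for c in range(len(columns)):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def independence_probability(fixed_columns, lam):
    """Exact probability that a random Bernoulli(lam) column is independent of the fixed ones."""
    m = len(fixed_columns[0])
    k = len(fixed_columns) + 1
    prob = Fraction(0)
    for col in product((0, 1), repeat=m):
        if rank(fixed_columns + [list(col)]) == k:
            ones = sum(col)
            prob += lam ** ones * (1 - lam) ** (m - ones)
    return prob


def lemma_bound(m, k, lam):
    """Lower bound 1 - max(lam, 1 - lam)^(m - k + 1) of Lemma 5.2."""
    lam_bar = max(lam, 1 - lam)
    return 1 - lam_bar ** (m - k + 1)


LAM = Fraction(3, 10)
FIXED = [[1, 0, 1, 0], [0, 1, 1, 0]]  # two linearly independent fixed columns, m = 4
EXACT_PROB = independence_probability(FIXED, LAM)
BOUND = lemma_bound(4, 3, LAM)
```

For this hypothetical instance the exact probability is 6717/10000, which is above the bound 51/100, as the lemma predicts.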

The following theorem considers the bases of a completely disaggregated problem and those of an aggregated problem. These bases are termed complete and aggregated bases, respectively. Also, we say that a complete basis is generable from an aggregated basis if it contains the disaggregated version of the columns associated with the aggregated basis.

Theorem 5.3 Consider a randomly generated set partitioning problem (SPP) involving $m$ constraints and $n$ variables and an aggregated problem derived from it containing $m_a$ constraints and $n_a$ variables ($m_a < m$ and $n_a \le n$). For each feasible aggregated basis $B$, the expected number of feasible complete bases generable from $B$ is $\Theta\left(C_{m-m_a}^{n-m_a}\right)$, where $C_j^i$ is the number of possible combinations of $j$ elements from a set of $i$ elements.

Proof Let $A$ be the $m \times n$ coefficient matrix of SPP and, without loss of generality, assume that, once disaggregated, the columns of $B$ correspond to $A_1, A_2, \ldots, A_{m_a}$, the first $m_a$ columns of $A$. To obtain a feasible basis for SPP, one has to select $m - m_a$ columns, denoted hereafter $A_{m_a+1}, A_{m_a+2}, \ldots, A_m$, from the set of $n - m_a$ columns of $A \setminus \{A_1, A_2, \ldots, A_{m_a}\}$ and verify that the $m$ columns $A_1, A_2, \ldots, A_m$ are linearly independent. Because there are at most $C_{m-m_a}^{n-m_a}$ ways to select these columns, the expected number of complete bases generable from $B$ cannot exceed $C_{m-m_a}^{n-m_a}$.

Now, let us determine a lower bound on this expected number. Denote by $E_k$ the event that the columns $A_1, A_2, \ldots, A_k$ are linearly independent. From probability theory, we know that the probability that $E_k$ occurs, denoted $P(E_k)$, can be expressed as

$$P(E_k) = P(E_{k-1}) \, P(E_k \mid E_{k-1}),$$

where $P(E \mid F)$ denotes the conditional probability that event $E$ occurs knowing that event $F$ occurred. The probability that a basis is obtained by adding $m - m_a$ random columns $A_{m_a+1}, A_{m_a+2}, \ldots, A_m$ to the set of the $m_a$ fixed columns $A_1, A_2, \ldots, A_{m_a}$ corresponds to the probability $P(E_m)$, which can be written

$$\begin{aligned} P(E_m) &= P(E_{m-1}) \, P(E_m \mid E_{m-1}) \\ &= P(E_{m-2}) \, P(E_{m-1} \mid E_{m-2}) \, P(E_m \mid E_{m-1}) \\ &= \cdots \\ &= P(E_{m_a}) \prod_{k=m_a+1}^{m} P(E_k \mid E_{k-1}). \end{aligned}$$

Since, by assumption, $P(E_{m_a}) = 1$ and, by Lemma 5.2, $P(E_k \mid E_{k-1}) \ge 1 - \bar{\lambda}^{m-k+1}$ for $k = m_a + 1, \ldots, m$, we obtain that

$$P(E_m) \ge \prod_{k=m_a+1}^{m} \left(1 - \bar{\lambda}^{m-k+1}\right).$$

Consequently, the expected number of bases generable from $B$ is greater than or equal to $C_{m-m_a}^{n-m_a} \prod_{k=m_a+1}^{m} \left(1 - \bar{\lambda}^{m-k+1}\right)$.

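To complete the proof, we need to show that $\prod_{k=m_a+1}^{m} \left(1 - \bar{\lambda}^{m-k+1}\right)$ is bounded below by a positive value that does not depend on $m$ or $m_a$. First, note that $\prod_{k=m_a+1}^{m} \left(1 - \bar{\lambda}^{m-k+1}\right) \ge \prod_{k=1}^{\infty} \left(1 - \bar{\lambda}^k\right)$. Then, let $S_r = \prod_{k=1}^{r} \left(1 - \bar{\lambda}^k\right)$ and $T_r = \ln S_r = \sum_{k=1}^{r} \ln\left(1 - \bar{\lambda}^k\right)$. Because

$$\lim_{k \to \infty} \frac{\ln\left(1 - \bar{\lambda}^{k+1}\right)}{\ln\left(1 - \bar{\lambda}^k\right)} = \bar{\lambda} < 1,$$

the sequence $\{T_r\}_{r=1}^{\infty}$ converges toward a number $\beta$ according to the d'Alembert ratio test. Thus, the sequence $\{S_r\}_{r=1}^{\infty}$ converges toward $e^{\beta}$, a positive value independent of $m$ and $m_a$. This proves that the expected number of bases generable from $B$ belongs to $\left[e^{\beta} C_{m-m_a}^{n-m_a}, \; C_{m-m_a}^{n-m_a}\right]$ or, equivalently, that this number is $\Theta\left(C_{m-m_a}^{n-m_a}\right)$. □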
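A quick numerical look at the constant $e^{\beta}$ may help make the last step concrete. The sketch below only evaluates the products appearing in the proof. The function names and the sample value $\bar{\lambda} = 0.5$ are assumptions chosen for illustration.

```python
import math


def partial_product(lam_bar, r):
    """S_r: product of (1 - lam_bar^k) for k = 1, ..., r."""
    value = 1.0
    for k in range(1, r + 1):
        value *= 1.0 - lam_bar ** k
    return value


def finite_lower_bound(lam_bar, m, m_a):
    """Product of (1 - lam_bar^(m-k+1)) for k = m_a + 1, ..., m."""
    value = 1.0
    for k in range(m_a + 1, m + 1):
        value *= 1.0 - lam_bar ** (m - k + 1)
    return value


LAM_BAR = 0.5
LIMIT = partial_product(LAM_BAR, 200)        # numerical stand-in for e^beta
FINITE = finite_lower_bound(LAM_BAR, 10, 4)  # same factors as S_6, in reverse order
```

For $\bar{\lambda} = 0.5$ the products settle near 0.2888, and the finite product for any $m - m_a$ stays above that value, which is what makes the lower bound independent of $m$ and $m_a$.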

From this theorem, we can easily deduce the following corollary.

Corollary 5.4 Let $X$ be an extreme point of a set partitioning problem where $d$ variables are non-degenerate. If an aggregated problem involving $m_a = d$ constraints can be used at this point, the expected number of bases associated with $X$ drops from $\Theta\left(C_{m-d}^{n-d}\right)$ to 1.

Since $n$ is much larger than $m$ in practical applications, Corollary 5.4 indicates that DCA could reduce the number of degenerate pivots by a factor as large as $\Theta\left(C_{m-d}^{n-d}\right)$. Indeed, the use of aggregated bases avoids visiting a large number of complete bases that are represented by a relatively small number of aggregated bases. Finally, note that when the aggregation is maximal, that is, $m_a = d$, the next simplex pivot is non-degenerate if there exists a compatible variable with a negative reduced cost. In the next section, we provide further insights on this case.
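To give a feel for the order of magnitude involved, the binomial coefficient $C_{m-d}^{n-d}$ can be evaluated for small, purely hypothetical sizes. The function name and the values of $n$, $m$ and $d$ below are assumptions chosen for illustration and are not taken from the experiments.

```python
import math


def complete_basis_upper_bound(n, m, d):
    """Number of ways to choose the m - d degenerate basic columns among n - d candidates."""
    return math.comb(n - d, m - d)


SMALL = complete_basis_upper_bound(n=20, m=6, d=4)     # C(16, 2)
LARGER = complete_basis_upper_bound(n=200, m=60, d=40)  # C(160, 20)
```

Even for these modest sizes the count grows from 120 to a number with more than twenty digits, which illustrates why avoiding the enumeration of complete bases matters.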

5.3 Favoring compatible and 1-incompatible variables

In this section, we highlight the potential of substantially reducing the impact of degeneracy when favoring compatible and 1-incompatible variables in the pricing step of the simplex algorithm. To do so, we establish conditions that guarantee a decrease in the objective function value after one simplex pivot when the current solution of $AP_Q$ contains no degenerate basic variables. This analysis is also conducted on pure set partitioning problems.

Let $Q_t$ and $AP_{Q_t}$ be the aggregating partition and the associated aggregated problem when the $t$th simplex pivot of the MPDCA method is performed. Denote also by $B_t$ and $X_t$ (resp. $B^t$ and $X^t$) the basis and basic solution just before (resp. after) this pivot. Note that $B_{t+1} \neq B^t$ when the partition is modified between the $t$th and the $(t+1)$th pivot. Hence, $X_{t+1}$ might be degenerate (with respect to $B_{t+1}$) even if $X^t$ is non-degenerate (with respect to $B^t$).

The following observation is straightforward from basic linear programming theory.

Observation 5.5 If $X_t$ is non-degenerate and there exists a variable compatible with $Q_t$ that has a negative reduced cost, then the objective function decreases at the $t$th pivot.

In the MPDCA algorithm, it is often possible to obtain a non-degenerate basic solution by fully aggregating the partition (that is, redefining the partition with a minimum number of clusters such that the current non-degenerate basic variables are compatible). In particular, if the basic solution $X^{t-1}$ obtained after the simplex pivot $t - 1$ is integer, then aggregating the partition yields a non-degenerate integer solution $X_t$. When the condition stated in the following proposition holds for several consecutive pivots, a sequence of integer solutions can be obtained from this non-degenerate integer solution by fully aggregating the partition after each simplex pivot. Such sequences were often observed in our computational experiments.

Proposition 5.6 If $X_t$ is non-degenerate and integer, and if there exists a variable compatible with $Q_t$ that has a negative reduced cost, then $X^t$ is also integer.

Proof If $X_t$ is non-degenerate and integer, each basic variable takes value 1 and contributes (with a coefficient equal to 1) to exactly one set partitioning constraint in the aggregated problem. Hence, by ordering the basic variables appropriately, $B_t$, the basis of the aggregated problem, is the identity matrix. In this case, it is easy to verify that entering into the basis a negative reduced cost variable yields a solution $X^t$ in which the entering variable takes value 1 and the other basic variables take either value 0 or 1. □
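The pivot described in this proof can be traced on a small example. The sketch below assumes an identity basis with all basic variables equal to 1 and a hypothetical 0/1 entering column that covers at least one aggregated constraint. The function name `pivot_from_identity` is an assumption made for illustration.

```python
def pivot_from_identity(entering_column):
    """One primal simplex pivot when the basis is the identity and every basic variable equals 1.

    entering_column holds the 0/1 coefficients of the entering variable in the
    aggregated constraints (assumed to contain at least one 1).
    Returns the value taken by the entering variable and the new values of the
    former basic variables.
    """
    # Ratio test: current basic value (1) divided by each positive column entry.
    step = min(1 / a for a in entering_column if a > 0)
    new_basic_values = [1 - step * a for a in entering_column]
    return step, new_basic_values


STEP, NEW_VALUES = pivot_from_identity([1, 0, 1, 0])
```

Here the entering variable takes value 1 and the former basic variables take values 0, 1, 0 and 1, so the solution remains integer, in line with Proposition 5.6.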

If $X^{t-1}$ is not integer, a non-degenerate basic solution $X_t$ may also be obtained by fully aggregating the partition. However, this is not always the case as shown by the following example. Consider that $AP_{Q_t}$ is composed of the set partitioning constraints

$$\begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 \end{pmatrix} X = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$

If $X^t = (0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0)^T$, then no constraints can be aggregated and, if the partition is not disaggregated, $B_{t+1} = B^t$ and $X_{t+1}$ ($= X^t$) is degenerate. Finally, note that, if $X^t$ is non-degenerate, the partition cannot be further aggregated.
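The claims made about this example can be checked mechanically. The sketch below rests on one reading of the aggregation rule, stated here as an assumption: two constraints may share a cluster only if every positive-valued variable has the same coefficient in both. The variable names are illustrative.

```python
from fractions import Fraction

A = [
    [1, 1, 0, 0, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 1, 0, 0],
]
HALF = Fraction(1, 2)
X = [HALF, HALF, HALF, HALF, 0, 0, 0, 0, 0]

# Left-hand side of each set partitioning constraint.
ROW_VALUES = [sum(a * x for a, x in zip(row, X)) for row in A]

# Indices of the positive (non-degenerate) variables.
SUPPORT = [j for j, x in enumerate(X) if x > 0]

# Coverage pattern of each constraint restricted to the positive variables.
PATTERNS = {tuple(row[j] for j in SUPPORT) for row in A}
```

All six constraints are satisfied, the six coverage patterns are pairwise distinct (so no two constraints can be merged under the stated rule), and only four variables are positive for six constraints, so any basis for this solution contains two degenerate variables.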

Before presenting the main result of this section (Theorem 5.7), we show how to derive an augmented basis $B_{t+1}$ from the basis $B^t$ when the partition $Q_t$ is disaggregated using solely one 1-incompatible variable $x_{s'}$, $s' \in S$. To build $B_{t+1}$, we propose to adjoin $x_{s'}$ to the current set of basic variables. Let $m$ be the number of clusters in $Q_t$, which is equal to the number of constraints in $AP_{Q_t}$. Without loss of generality, assume that $x_{s'}$ is only incompatible with the $m$th ordered cluster $W_m$. Hence, column $s'$ does not contain all the elements of $W_m$. Denote by $W_m^1$ (resp. $W_m^0$) the subset of the elements of $W_m$ that $s'$ contains (resp. does not contain). Again, without loss of generality, assume that the cluster $W_m^0$ (resp. $W_m^1$) is associated with the $m$th (resp. $(m+1)$th) set partitioning constraint in $AP_{Q_{t+1}}$. Finally, denote by $a_{ij}^t$, $1 \le i, j \le m$, the components of the basis $B^t$.

Adding $x_{s'}$ as the $(m+1)$th basic variable, we obtain the augmented basis $B_{t+1} = (a_{ij}^{t+1})_{1 \le i,j \le m+1}$, where

$$a_{ij}^{t+1} = a_{ij}^t, \quad \forall i, j = 1, \ldots, m, \tag{5.1}$$

$$a_{m+1,j}^{t+1} = a_{m,j}^t, \quad \forall j = 1, \ldots, m, \tag{5.2}$$

$$a_{i,m+1}^{t+1} \in \{0, 1\}, \quad \forall i = 1, \ldots, m - 1, \tag{5.3}$$

$$a_{m,m+1}^{t+1} = 0, \quad a_{m+1,m+1}^{t+1} = 1, \tag{5.4}$$

and the exact values of $a_{i,m+1}^{t+1}$, $i = 1, \ldots, m - 1$, are known from column $s'$. Note that this basis yields a degenerate solution $X_{t+1}$ in which $x_{s'}$ is the only degenerate variable.
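The construction (5.1)–(5.4) can be written out on a small case. In the sketch below, the function name `augment_basis`, the 3 × 3 identity used as the current basis and the entries chosen for column $s'$ are hypothetical values for illustration.

```python
def augment_basis(basis, entering_upper):
    """Build the augmented basis from an m x m basis and a 1-incompatible column.

    basis: list of m rows of the current basis.
    entering_upper: the m - 1 known 0/1 entries of the new column in rows 1..m-1.
    """
    m = len(basis)
    rows = [list(basis[i]) + [entering_upper[i]] for i in range(m - 1)]  # (5.1), (5.3)
    rows.append(list(basis[m - 1]) + [0])  # row m: cluster part not covered, (5.4)
    rows.append(list(basis[m - 1]) + [1])  # row m + 1: copy of row m, (5.2) and (5.4)
    return rows


IDENTITY3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
AUGMENTED = augment_basis(IDENTITY3, [1, 0])

# Former basic values followed by 0 for the adjoined variable.
X_NEXT = [1, 1, 1, 0]
RESIDUALS = [sum(a * x for a, x in zip(row, X_NEXT)) for row in AUGMENTED]
```

In this example the former basic variables keep their values and the adjoined variable is basic at value 0, so it is the only degenerate variable, as noted above.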

Now, let us establish conditions which guarantee that the $(t+1)$th simplex pivot is non-degenerate when $X^t$ is non-degenerate and there exists at least one negative reduced cost variable that has one incompatibility with respect to partition $Q_t$. We assume that there are no negative reduced cost compatible variables since this case has been considered in Observation 5.5.

Theorem 5.7 Let $Q_t$ be the partition at iteration $t$, $B^t$ the basis obtained after the $t$th simplex pivot, and $X^t$ the corresponding solution. Assume that $X^t$ is non-degenerate, that there are no negative reduced cost compatible variables, but that there is at least one 1-incompatible negative reduced cost variable. Let $Q_{t+1}$ be the partition obtained by disaggregating $Q_t$ using solely $x_{s'}$, the 1-incompatible variable with the smallest reduced cost. Let $B_{t+1}$ be the basis derived as above from $B^t$ and this variable. Then, the objective value decreases at the $(t+1)$th simplex pivot if there exists a negative reduced cost variable compatible with $Q_{t+1}$.

We prove this theorem using the next three propositions. However, to begin our proof, we need to discuss the impact of disaggregating the partition on the dual variable values. Let $\alpha_i^t$, $i = 1, \ldots, m$, (resp. $\alpha_i^{t+1}$, $i = 1, \ldots, m + 1$) and $\bar{c}_j^t$, $j = 1, \ldots, m$, (resp. $\bar{c}_j^{t+1}$, $j = 1, \ldots, m + 1$) be the dual variable values of the constraints in $AP_{Q_t}$ (resp. $AP_{Q_{t+1}}$) and the reduced costs of the basic variables with respect to $B^t$ (resp. $B_{t+1}$). Denote also by $\bar{c}_{m+1}^t$ the reduced cost of $x_{s'}$ computed using the disaggregated dual values after the $t$th simplex pivot and by $\pi_m^{t,0}$ (resp. $\pi_m^{t,1}$) the sum of the disaggregated dual values of the constraints associated with the elements in $W_m^0$ (resp. $W_m^1$). Note that, as mentioned in Elhallaoui et al. [10],

$$\pi_m^{t,0} + \pi_m^{t,1} = \alpha_m^t. \tag{5.5}$$

The next proposition establishes a relation between the dual values $\alpha_i^{t+1}$ and $\alpha_i^t$.

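Proposition 5.8 The dual values $\alpha_i^{t+1}$, $i = 1, \ldots, m + 1$, are:

$$\alpha_i^{t+1} = \alpha_i^t, \quad \forall i = 1, \ldots, m - 1, \tag{5.6}$$

$$\alpha_m^{t+1} = \pi_m^{t,0} - \bar{c}_{m+1}^t \quad \text{and} \quad \alpha_{m+1}^{t+1} = \pi_m^{t,1} + \bar{c}_{m+1}^t. \tag{5.7}$$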

Proof To prove this statement, it is sufficient to show that the reduced costs satisfy

$$\bar{c}_j^{t+1} = c_j - \sum_{i=1}^{m+1} \alpha_i^{t+1} a_{ij}^{t+1} = 0, \quad \forall j = 1, \ldots, m + 1,$$

for the dual values given by (5.6) and (5.7). From (5.1), (5.2) and (5.5), it is easy to deduce that $\bar{c}_j^{t+1} = \bar{c}_j^t = 0$ for all $j = 1, \ldots, m$. For $j = m + 1$, we obtain from (5.3) to (5.5) that

$$\begin{aligned} \bar{c}_{m+1}^{t+1} &= c_{m+1} - \sum_{i=1}^{m+1} \alpha_i^{t+1} a_{i,m+1}^{t+1} = c_{m+1} - \sum_{i=1}^{m-1} \alpha_i^t a_{i,m+1}^{t+1} - \pi_m^{t,1} - \bar{c}_{m+1}^t \\ &= \bar{c}_{m+1}^t - \bar{c}_{m+1}^t = 0. \end{aligned}$$

□

The following proposition makes it possible to bound the reduced costs of certain variables with respect to the basis $B_{t+1}$.

Proposition 5.9 Let $x_q$, $q \in S$, be an arbitrary variable in $AP_{Q_{t+1}}$ such that column $q$ contains the elements in $W_m^1$ but not those in $W_m^0$. Then, its reduced cost (with respect to the dual values $\alpha_i^{t+1}$, $i = 1, \ldots, m + 1$), denoted $\bar{c}_q^{t+1}$, is nonnegative.

Proof From (5.5) to (5.7), we can easily deduce that $\bar{c}_q^{t+1} = \bar{c}_q^t - \bar{c}_{m+1}^t$. Because $x_q$ has one incompatibility with $Q_t$ and $x_{s'}$ was selected as the 1-incompatible variable with the smallest reduced cost, we obtain that $\bar{c}_q^t \ge \bar{c}_{m+1}^t$ and, consequently, that $\bar{c}_q^{t+1} \ge 0$. □
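The dual update of Proposition 5.8 and the reduced cost shift used in the proof above can be verified on a toy instance. Every number below is hypothetical: $m = 2$ aggregated constraints, an identity basis (so the duals equal the basic costs), and disaggregated duals 2 and 4 for the two parts of the split cluster.

```python
# Hypothetical data with m = 2 and the identity as current basis.
cost_basic = [4, 6]
alpha_t = [4, 6]            # duals of the aggregated problem before the split
pi0, pi1 = 2, 4             # disaggregated duals of the two cluster parts, see (5.5)
assert pi0 + pi1 == alpha_t[1]

# Column s': coefficient 1 in constraint 1, covers only the second cluster part.
cost_s = 7
rc_s = cost_s - alpha_t[0] * 1 - pi1            # reduced cost of x_{s'}, equal to -1

# Dual values after disaggregation, following (5.6) and (5.7).
alpha_next = [alpha_t[0], pi0 - rc_s, pi1 + rc_s]

# Augmented basis built as in (5.1)-(5.4) and costs of its three basic variables.
basis_next = [[1, 0, 1], [0, 1, 0], [0, 1, 1]]
costs_next = cost_basic + [cost_s]
reduced_basic = [
    costs_next[j] - sum(alpha_next[i] * basis_next[i][j] for i in range(3))
    for j in range(3)
]

# A variable x_q covering the second cluster part only (column [0, 0, 1]).
cost_q = 5
rc_q_before = cost_q - pi1                      # with the disaggregated duals
rc_q_after = cost_q - alpha_next[2]             # with the duals of Proposition 5.8
```

In this instance the three basic reduced costs vanish, and the reduced cost of $x_q$ moves from 1 to 2, which is its previous value minus the reduced cost of $x_{s'}$.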

The following proposition completes the proof of Theorem 5.7.

Proposition 5.10 In $AP_{Q_{t+1}}$, if there exists a variable compatible with $Q_{t+1}$ that has a negative reduced cost (with respect to the dual values $\alpha_i^{t+1}$, $i = 1, \ldots, m + 1$), then the objective value decreases with the $(t+1)$th pivot.



Proof Let $x_q$, $q \in S$, be such a negative reduced cost compatible variable and denote by $A_q = (a_{iq})_{i=1,\ldots,m+1}$ its vector of coefficients associated with the constraints of $AP_{Q_{t+1}}$. According to Proposition 5.9, $(a_{mq}, a_{m+1,q}) \neq (0, 1)$. Consequently, $(a_{mq}, a_{m+1,q}) \in \{(0, 0), (1, 1), (1, 0)\}$. Now, assume that $x_q$ is the entering variable for the $(t+1)$th pivot. To prove that this pivot is non-degenerate, we will show that $x_{s'}$ (the only degenerate basic variable in the current solution) cannot be the leaving variable.

Consider $B_{t+1} = (a_{ij}^{t+1})_{1 \le i,j \le m+1}$, the current basis as constructed above, and $(B_{t+1})^{-1} = (b_{ij}^{t+1})_{1 \le i,j \le m+1}$, its inverse matrix. To compute $(B_{t+1})^{-1}$, we can transform the augmented matrix $[B_{t+1} \mid I]$ into the matrix $[I \mid (B_{t+1})^{-1}]$ using elementary row operations. In particular, the last row of this second matrix can be directly obtained by subtracting the $m$th row of the first matrix from its $(m+1)$th row. Doing so, we find that $b_{m+1,m}^{t+1} = -1$, $b_{m+1,m+1}^{t+1} = 1$ and $b_{m+1,j}^{t+1} = 0$ for $j = 1, \ldots, m - 1$. Consequently, the $(m+1)$th component of $(B_{t+1})^{-1} A_q$ is equal to $-a_{mq} + a_{m+1,q}$, that is, it is equal to either 0 or $-1$ according to the possible values for $a_{mq}$ and $a_{m+1,q}$ mentioned above. Thus, $x_{s'}$ cannot be the leaving variable and the pivot cannot be degenerate. □
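The row operation invoked in this proof can be written out explicitly. The short derivation below uses only (5.1), (5.2) and (5.4). The unit vectors $e_m$ and $e_{m+1}$ of $\mathbb{R}^{m+1}$ are notation introduced here for convenience and do not appear in the text above.

```latex
% Row m+1 minus row m of the augmented basis, by (5.1), (5.2) and (5.4):
\begin{align*}
a^{t+1}_{m+1,j} - a^{t+1}_{m,j} &= a^{t}_{m,j} - a^{t}_{m,j} = 0, && j = 1, \ldots, m, \\
a^{t+1}_{m+1,m+1} - a^{t+1}_{m,m+1} &= 1 - 0 = 1.
\end{align*}
% Hence (e_{m+1} - e_m)^T B_{t+1} = e_{m+1}^T, so the last row of the inverse is
\[
e_{m+1}^{T} (B_{t+1})^{-1} = (e_{m+1} - e_m)^{T} = (0, \ldots, 0, -1, 1),
\]
% and the (m+1)th component of (B_{t+1})^{-1} A_q equals
\[
a_{m+1,q} - a_{mq} =
\begin{cases}
0 & \text{if } (a_{mq}, a_{m+1,q}) \in \{(0,0), (1,1)\}, \\
-1 & \text{if } (a_{mq}, a_{m+1,q}) = (1,0).
\end{cases}
\]
```

Since this component is never positive, the row of $x_{s'}$ does not take part in the ratio test, which is why $x_{s'}$ cannot leave the basis.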

Observation 5.5 and Theorem 5.7 suggest that restricting the pricing to the compatible and 1-incompatible variables while aggregating the partition as much as possible has a high potential for reducing degeneracy. In the MPDCA algorithm, restricted pricing is obviously performed in phases 0 and 1. It is also performed in all other phases when the algorithm executes a series of simplex pivots on the same aggregated problem (step 2 in Fig. 1) and when the algorithm decides to keep the same partition in step 8 because it seems preferable to progress with the compatible variables only. On the other hand, for aggregating the partition as much as possible, the MPDCA algorithm should execute a small number of pivots in step 2 and favor updating the partition in step 8. However, in practice, we noticed that the manipulation time required to update (aggregate or disaggregate) the partition is non-negligible and often exceeds the time required to execute a simplex pivot on a sufficiently aggregated problem. Therefore, it seems more efficient to avoid aggregating the partition too often at the expense of executing some degenerate pivots, the partition being aggregated only when the number of degenerate basic variables becomes high. Since degeneracy is not too impeding when the number of degenerate variables is relatively small, as shown in Sect. 5.2, such a strategy offers an interesting tradeoff between the partition handling time and the time spent performing degenerate pivots.

6 Computational experiments

In this section, we report computational results to compare the performance of the DCA method of Elhallaoui et al. [10] and the proposed MPDCA method for solving large-scale set partitioning type instances. Both methods are tested in a column generation context on the linear relaxation of instances of the VCSP, which can be briefly stated as follows (see Haase et al. [17] for details). Given a set of timetabled bus trips and a set of possible driver relief points along those trips which divide them into segments, find minimum-cost valid crew and bus schedules such that all trips are covered by a bus, all segments and bus empty moves (i.e., repositioning moves without passengers) are covered by a driver, and all collective agreement rules defining valid driver schedules are satisfied.

For our experiments, we used the VCSP instance random generator of Haase et al. [17] and their set partitioning type model. This model contains one variable for each driver schedule (these variables are generated from the column generation subproblems), one variable to count the number of buses, one set partitioning constraint for each trip segment and two additional set partitioning constraints for each trip (associated with a starting and an ending task). It also involves a small number of side constraints to ensure that optimal bus schedules can be computed a posteriori in polynomial time. In this set partitioning type model, the columns of $S$ are the feasible driver schedules and the elements of $W$ are the tasks defined for each trip segment and the start and end of each trip.

All tests were performed on a DELL i386 single processor Redhat Linux machine (Intel Pentium 4, Type i686 CPU, 1.8 GHz) using version 4.3 of the column generation software GENCOL (commercialized by Kronos Inc.), which relies on CPLEX, version 7.5, for solving the aggregated problems.

6.1 Impact of the initial partition quality on the MPDCA method

Before comparing the DCA and MPDCA methods, let us discuss the impact of the initial aggregating partition quality on the performance of both methods. To evaluate this impact, we conducted the following computational experiment. Three 800-task instances of the VCSP were randomly generated as in Haase et al. [17]. For each instance, a reference aggregating partition was built by defining one cluster for each driver schedule in an integer optimal solution that was previously computed. Such a partition is considered a very good initial partition because the integrality gap is often very small for VCSP instances. Then, for each instance and each value of $q \in \{1, 2, \ldots, 28\}$, three other initial aggregating partitions were randomly generated by introducing $q$ random perturbations in the reference partition while preserving the same number of clusters. A perturbation is created by randomly breaking a cluster into two parts and randomly concatenating one of the two parts with another cluster. Note that the quality of a partition with $q$ perturbations generally diminishes as $q$ increases. Overall, 85 initial partitions were thus generated for each instance. Next, the linear relaxation of each of the three instances was solved with the DCA and MPDCA methods 85 times each using the associated initial partitions. Finally, average solution times were computed for each method and each value of $q \in \{0, 1, 2, \ldots, 28\}$, where $q = 0$ corresponds to the reference partitions. The averages were thus computed over three test runs for $q = 0$ and nine test runs for the other values of $q$.
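A perturbation of the kind described above can be sketched as follows. This is a hypothetical reading of the procedure, not the generator used for the experiments: the function name `perturb`, the choice to append the moved part at the end of the receiving cluster, and the uniform random choices are all assumptions. The sketch requires at least two clusters and one cluster with two or more tasks.

```python
import random


def perturb(partition, rng):
    """Break one cluster into two parts and concatenate one part with another cluster.

    The number of clusters is preserved because the part that stays is never empty.
    """
    clusters = [list(cluster) for cluster in partition]
    breakable = [i for i, cluster in enumerate(clusters) if len(cluster) >= 2]
    src = rng.choice(breakable)
    cut = rng.randrange(1, len(clusters[src]))
    head, tail = clusters[src][:cut], clusters[src][cut:]
    moved, kept = (head, tail) if rng.random() < 0.5 else (tail, head)
    dst = rng.choice([i for i in range(len(clusters)) if i != src])
    clusters[src] = kept
    clusters[dst] = clusters[dst] + moved
    return clusters


REFERENCE = [["1-1", "1-2", "1-3", "1-4"], ["2-1", "2-2", "2-3"]]
rng = random.Random(0)
perturbed = REFERENCE
for _ in range(5):  # q = 5 perturbations
    perturbed = perturb(perturbed, rng)
```

Whatever the random draws, the perturbed partition keeps the same number of clusters and the same set of tasks as the reference partition.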

Figures 2 and 3 present the average solution time in seconds (vertical axis) required by the DCA and the MPDCA method, respectively, for each number $q$ of perturbations in a partition (horizontal axis). These results show that the DCA algorithm of Elhallaoui et al. [10] exhibits a similar behavior regardless of the quality of the initial partition and does not fully take advantage of a very good initial partition. In contrast, we see that the solution time of the MPDCA method increases as the initial partition quality diminishes. The MPDCA algorithm is thus positively influenced by a good initial partition. Note also that the MPDCA algorithm outperformed the DCA algorithm for all test cases.

Fig. 2 Impact of the initial partition quality on the DCA algorithm solution time (plot not reproduced; horizontal axis: number of perturbations, 0 to 30; vertical axis: solution time in seconds, 0 to 400)

Fig. 3 Impact of the initial partition quality on the MPDCA algorithm solution time (plot not reproduced; horizontal axis: number of perturbations, 0 to 30; vertical axis: solution time in seconds, 0 to 100)

6.2 Performance comparison

To assess the performance of the MPDCA method, we conducted computational experiments on VCSP instances of different sizes using a standard column generation method (STD) without DCA, the DCA method, and the MPDCA method. For each instance, the same initial aggregating partition was used for both the DCA and MPDCA methods. In this partition, an ordered cluster is created for each trip and contains, in order, the trip starting task, the segment tasks, and the ending task. The results of these experiments are reported in Tables 2 and 3 and correspond to averages over three different instances for all test cases. For each case, we provide the number of column generation iterations (CG iter), the number of aggregating partition changes (Part chg), the average number of constraints in the aggregated restricted master problem (AP constr), the number of fractional variables in the linear relaxation solution (Fract var), the time spent solving the column generation subproblems (SP time), the time spent solving the aggregated restricted master problems (AP time), and the total time. All times are in seconds.

Table 2 Results for 80-trip VCSP instances with a varying number of segments

Segments (number of tasks)

                 2 (320)              4 (480)              6 (640)              8 (800)
                 STD   DCA   MPDCA    STD   DCA   MPDCA    STD   DCA   MPDCA    STD    DCA   MPDCA
CG iter          48    79    61       81    148   68       138   231   70       207    328   90
Part chg         –     63    42       –     72    53       –     102   54       –      110   74
AP constr        353   264   164      513   310   177      673   382   182      833    483   206
Fract var        92    72    42       117   112   64       146   131   41       150    164   76
SP time (s)      2     4     5        14    24    13       44    73    25       110    168   58
AP time (s)      11    6     1        81    11    1        307   36    1        901    109   3
Total time (s)   15    12    7        98    42    18       357   124   34       1,020  301   73

Table 3 Results for large VCSP instances (eight segments per trip)

Trips (number of tasks)

                 120 (1,200)              160 (1,600)              200 (2,000)
                 STD     DCA     MPDCA    STD     DCA     MPDCA    STD     DCA     MPDCA
CG iter          480     906     226      618     939     433      798     1,040   242
Part chg         –       295     165      –       324     342      –       577     226
AP constr        1,250   763     423      1,664   1,083   549      2,084   1,139   507
Fract var        269     235     184      343     278     139      543     490     248
SP time (s)      676     1,638   515      2,393   3,980   2,688    3,932   5,294   2,056
AP time (s)      5,395   1,120   89       21,903  7,152   1,021    53,957  5,479   308
Total time (s)   6,093   2,827   647      24,337  11,250  3,830    57,948  11,201  2,475

Table 2 reports the results for 80-trip VCSP instances involving different numbers of segments per trip, while Table 3 focuses on larger VCSP instances involving 8 segments per trip. Overall, the instances comprise between 320 and 2,000 tasks (set partitioning constraints). These results indicate that, for the largest instances (640 tasks or more), the MPDCA method is between 2.9 and 4.5 times faster than the DCA method, and between 6.3 and 23.4 times faster than the STD method. For the smaller instances, the MPDCA method also performs better than the DCA method, but the speedup factors are smaller.
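As a quick check, the speedup factors quoted above can be recomputed from the total times in Tables 2 and 3. The short Python sketch below does this for the instances with 640 tasks or more; the dictionary layout and variable names are our own choices, and the ratios are computed from the rounded table entries, so they may differ slightly in the last digit from factors computed on unrounded data.

```python
# Total times in seconds (STD, DCA, MPDCA), keyed by number of tasks,
# copied from the "Total time (s)" rows of Tables 2 and 3.
total_times = {
    640: (357, 124, 34),
    800: (1020, 301, 73),
    1200: (6093, 2827, 647),
    1600: (24337, 11250, 3830),
    2000: (57948, 11201, 2475),
}

# Speedup of MPDCA over each of the two other methods
speedup_vs_dca = {n: dca / mpdca for n, (std, dca, mpdca) in total_times.items()}
speedup_vs_std = {n: std / mpdca for n, (std, dca, mpdca) in total_times.items()}

for n in total_times:
    print(f"{n:>5} tasks: {speedup_vs_dca[n]:.2f}x vs DCA, {speedup_vs_std[n]:.2f}x vs STD")
```

The smallest ratios are obtained for the 1,600-task instances and the largest ones for the 2,000-task instances.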

Table 4 Results on integrality

                                                  STD    DCA  MPDCA
Number of tests                                    21    276    276
Number of integer solutions                        20    259    275
Number of integer solutions with gap <0.05%         0      6    177
Number of integer linear relaxation solutions       0      2     35
Average best solution gap (%)                      72     63      3

The gains in solution time realized by the MPDCA method over the DCA method are mostly due to a smaller number of column generation iterations and a smaller number of constraints in the AP. To see this, let us compare, for example, the results for the largest instance (last two columns of Table 3). For this instance, the number of iterations was reduced by more than 75%, while the number of constraints in the AP dropped by more than 50%, resulting in reductions of 94% and 61% of the time spent solving the AP and the subproblems, respectively. For all instances, the average number of aggregating partition changes per iteration has increased. This can be explained by the fact that the incompatible columns used to disaggregate the partition are more often 1-incompatible variables. Therefore, more partition changes are needed to reach the final partition. Finally, observe that the number of fractional variables in the linear relaxation solution has dropped considerably because of the smaller size of the AP. This indicates that the search for an integer solution using a branch-and-bound algorithm might be easier when it starts from the linear relaxation solution computed by the MPDCA algorithm.
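The percentages quoted for the largest instance follow directly from the DCA and MPDCA columns of Table 3 for 200 trips. The Python sketch below reproduces them; the dictionary keys and the helper `reduction_pct` are hypothetical names introduced for this illustration.

```python
# DCA and MPDCA results for the 200-trip (2,000-task) instances in Table 3
dca = {"cg_iter": 1040, "part_chg": 577, "ap_constr": 1139, "sp_time": 5294, "ap_time": 5479}
mpdca = {"cg_iter": 242, "part_chg": 226, "ap_constr": 507, "sp_time": 2056, "ap_time": 308}


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {
    key: reduction_pct(dca[key], mpdca[key])
    for key in ("cg_iter", "ap_constr", "ap_time", "sp_time")
}

# Average number of aggregating partition changes per column generation iteration
changes_per_iter = {
    "DCA": dca["part_chg"] / dca["cg_iter"],
    "MPDCA": mpdca["part_chg"] / mpdca["cg_iter"],
}

for key, value in reductions.items():
    print(f"{key}: {value:.1f}% reduction")
print(changes_per_iter)
```

For this instance, the number of partition changes per iteration is higher with the MPDCA method than with the DCA method, in line with the observation made above.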

This last observation led us to analyze more closely the computation of integer solutions during the solution process. For all the tests that we conducted for this paper (21, 276 and 276 with the STD, DCA and MPDCA methods, respectively), we gathered the statistics reported in Table 4, which are, in order: the overall number of integer solutions obtained during the column generation process, the number of these solutions with a gap smaller than 0.05% (compared to the linear relaxation optimal value), the number of linear relaxation solutions that were integer, and the average gap (in percentage) of the best integer solution found during the linear relaxation solution process when at least one was found. These results clearly show that, while solving a linear relaxation, the MPDCA algorithm produces integer solutions of better quality more often than the two other methods. In particular, we notice that integer solutions with a gap less than 0.05% were produced for 64% of the tests and integer linear relaxation solutions were obtained for 20% of them.

We attribute these impressive results to the fact that a good initial aggregating partition can easily be derived for the VCSP. We believe that good initial partitions can also be obtained for many other applications to yield similar speedups. Indeed, in several crew scheduling problems such as the VCSP, crews do not often change vehicles in the middle of their workdays. In this case, a good initial partition can be computed from a previously computed vehicle schedule. Another frequent situation where a good initial partition can be found is when a planned solution to a given problem has to be reoptimized because of a relatively small perturbation in the input data. In this case, one often wants to compute a new solution that is similar to the planned solution, which can then provide a good initial partition. More generally, for any set partitioning problem, a good initial partition can be derived from a good heuristic solution that can even be slightly infeasible. Obviously, when no good initial aggregating partition is available for a specific problem, impressive gains in computational times might not be expected.

7 Conclusion

In this paper, we proposed the MPDCA algorithm, which is a substantial improvement of the DCA method of Elhallaoui et al. [10]. It incorporates a partial pricing strategy that favors the variables compatible or slightly incompatible with the current aggregating partition, yielding a slower disaggregation process. This strategy makes it possible to take advantage of a good initial partition and reduces solution times substantially. Test results on instances of the VCSP showed that the MPDCA method is up to 4.5 times faster than the DCA method, and up to 23.4 times faster than the STD method. Moreover, the MPDCA method often produces very good quality integer solutions during the linear relaxation solution process. To support these impressive empirical results, we analyzed theoretically some crucial steps of the MPDCA algorithm to provide some insights into its performance.

Future research will concentrate on integrating this new method in a branch-and-bound process and testing it on real-world set partitioning type problems arising in various domains.

Acknowledgments This work was supported by an NSERC grant. The authors are thankful to two anonymous referees for making several suggestions that improved the presentation of the paper.

References

1. Balinski, M.L., Gomory, R.E.: A mutual primal-dual simplex method. In: Graves, R.L., Wolfe, P. (eds.) Recent Advances in Mathematical Programming, pp. 17–28. McGraw-Hill, New York (1963)

2. Balinski, M.L., Quandt, R.E.: On an integer program for a delivery problem. Oper. Res. 12, 300–304 (1964)

3. Bland, R.G.: New finite pivoting rule for simplex method. Math. Oper. Res. 2, 103–107 (1977)

4. Charnes, A.: Optimality and degeneracy in linear programming. Econometrica 20, 160–170 (1952)

5. Cordeau, J.-F., Desaulniers, G., Lingaya, N., Soumis, F., Desrosiers, J.: Simultaneous locomotive and car assignment at VIA Rail Canada. Transp. Res. B 35, 767–787 (2001)

6. Dantzig, G.B., Orden, A., Wolfe, P.: The generalized simplex method for minimizing a linear form under linear inequality restraints. Pac. J. Math. 5, 183–195 (1955)

7. Desrochers, M., Desrosiers, J., Soumis, F.: A new optimization algorithm for the vehicle routing problem with time windows. Oper. Res. 40, 342–354 (1992)

8. Desrochers, M., Soumis, F.: A generalized permanent labeling algorithm for the shortest path problem with time windows. INFOR 26, 191–212 (1988)

9. Desrochers, M., Soumis, F.: A column generation approach to the urban transit crew scheduling problem. Transp. Sci. 23, 1–13 (1989)

10. Elhallaoui, I., Villeneuve, D., Soumis, F., Desaulniers, G.: Dynamic aggregation of set partitioning constraints in column generation. Oper. Res. 53, 632–645 (2005)

11. Fletcher, R.: A new degeneracy method and steepest-edge-based conditioning for LP. SIAM J. Optim. 8, 1038–1059 (1998)

12. Gal, T. (ed.): Degeneracy in optimization problems. Ann. Oper. Res. 46/47, 1–7 (1993)

13. Gamache, M., Soumis, F., Marquis, G., Desrosiers, J.: A column generation approach for large scale aircrew rostering problems. Oper. Res. 47, 247–263 (1999)

14. Geoffrion, A.M.: Lagrangean relaxation for integer programming. Math. Program. Study 2, 82–114 (1974)

15. Glover, F.: Surrogate constraints. Oper. Res. 16, 741–749 (1968)

16. Goldfarb, D., Reid, J.K.: A practicable steepest-edge simplex algorithm. Math. Program. 12, 361–371 (1977)

17. Haase, K., Desaulniers, G., Desrosiers, J.: Simultaneous vehicle and crew scheduling in urban mass transit systems. Transp. Sci. 35, 286–303 (2001)

18. Harris, P.M.: Pivot selection methods of the devex LP code. Math. Program. 5, 1–28 (1973)

19. Hoffman, K.L., Padberg, M.: Solving airline crew scheduling problems by branch-and-cut. Manage. Sci. 39, 657–682 (1993)

20. Irnich, S., Desaulniers, G.: Shortest path problems with resource constraints. In: Desaulniers, G., Desrosiers, J., Solomon, M.M. (eds.) Column Generation, pp. 33–65. Springer, New York (2005)

21. Mendelssohn, R.: An iterative aggregation procedure for Markov decision process. Oper. Res. 30, 62–73 (1982)

22. Pan, P.-Q.: A basis deficiency-allowing variation of the simplex method for linear programming. Comput. Math. Appl. 36(3), 33–53 (1998)

23. Rogers, D.F., Plante, R.D., Wong, R.T., Evans, J.R.: Aggregation and disaggregation techniques and methodology in optimization. Oper. Res. 39, 553–582 (1991)

24. Ryan, D.M., Osborne, M.: On the solution of highly degenerate linear programmes. Math. Program. 41, 385–392 (1988)

25. Shetty, C.M., Taylor, R.W.: Solving large-scale linear programs by aggregation. Comput. Oper. Res. 14, 385–393 (1987)

26. Terlaky, T., Sushong, Z.: Pivot rules for linear programming: a survey on recent theoretical developments. Ann. Oper. Res. 46, 203–233 (1993)

27. Villeneuve, D.: Logiciel de Génération de Colonnes, Ph.D. Dissertation. Université de Montréal, Canada (1999)

28. Wolfe, P.: A technique for resolving degeneracy in LP. SIAM J. 2, 205–211 (1963)
