Progressive hedging and tabu search applied to mixed integer (0,1) multistage stochastic programming


<ul><li><p>Journal of Heuristics, 2:111-128 (1996) © 1996 Kluwer Academic Publishers </p><p>Progressive Hedging and Tabu Search Applied to Mixed Integer (0, 1) Multistage Stochastic Programming </p><p>ARNE LOKKETANGEN, Molde College, Britvn. 2, 6400 Molde, Norway </p><p>DAVID L. WOODRUFF, Graduate School of Management, UC Davis, Davis, CA 95616, USA </p><p>Abstract </p><p>Many problems faced by decision makers are characterized by a multistage decision process with uncertainty about the future and some decisions constrained to take on values of either zero or one (for example, either open a facility at a location or do not open it). Although some mathematical theory exists concerning such problems, no general-purpose algorithms have been available to address them. In this article, we introduce the first implementation of general-purpose methods for finding good solutions to multistage, stochastic mixed-integer (0, 1) programming problems. The solution method makes use of Rockafellar and Wets' progressive hedging algorithm, which averages solutions rather than data. Solutions to the induced quadratic (0, 1) mixed-integer subproblems are obtained using a tabu search algorithm. We introduce the notion of integer convergence for progressive hedging. Computational experiments verify that the method is effective. The software that we have developed reads standard (SMPS) data files. </p><p>Key Words: stochastic programming, integer programming, progressive hedging, tabu search </p><p>Many problems faced by decision makers are characterized by a multistage decision process with uncertainty about the future, recourse, and some decisions constrained to take on values of either zero or one (for example, either open a facility at a location or do not open it). By multistage we mean that there is a series of decisions to be made. 
By the word recourse we mean that at each additional decision point additional information is available (that is, some random variables have been realized). Since these decision stages typically correspond to time points, we index them by t = 1, . . . , τ and call this index set T. Even though we are required only to implement decisions needed "right now," we want to take into account the fact that we will be making additional decisions in the future. In other words, we may want to "hedge" against uncertainty about the future. </p><p>By incorporating integer variables, uncertainty, and recourse we enable very realistic models for the extremely common situations where systems must be designed and operated over a period of time and the design may evolve if needed. For example, production and distribution formulations such as those found in textbooks (e.g., Fourer, Gay, and Kernighan, 1993, Sec. 4.3) can be extended to take into account fixed costs associated with opening or closing facilities by using integers, and more realistic modeling of the multistage decision process by including uncertainty and recognizing that future decisions can depend on information that will become available in the future. For example, we may decide to open a smaller facility now if we know that we can decide to open another one in the future after we have new information. Similar opportunities abound in problems concerning telecommunications network design, oil field development, real estate development, and a host of other disparate problem domains. </p><p>At the heart of this increased modeling power is the representation of uncertain events using random variables. In theory, these random variables can take on a very large number of values; it may not be reasonable or useful to consider distribution functions. 
It may not be reasonable because there may not be sufficient data to estimate entire multivariate distributions (especially with no reason to expect or require independence). It may not be useful because in many cases the essence of the stochastics can be captured by specifying a reasonable number of representative scenarios. We assume that scenarios are specified giving a full set of random variable realizations and a corresponding probability. We index the scenario set S by s and refer to the probability of occurrence of s (or, more accurately, a realization "near" scenario s) as Pr(s). Let the number of scenarios be given by S. </p><p>For each scenario s and each stage t we are given a row vector c(s, t) of length n(t), an m(t) × n(t) matrix A(s, t), and a column vector b(s, t) of length m(t). Let N(t) be the index set 1, . . . , n(t) and M(t) be the index set 1, . . . , m(t). For notational convenience let A(s) be (A(s, 1), . . . , A(s, τ)) and let b(s) be (b(s, 1), . . . , b(s, τ)). The decision variables are a set of n(t) vectors x(t) with one vector for each scenario. </p><p>Notice that the solution is allowed to depend on the scenario. Let X(s) be (x(s, 1), . . . , x(s, τ)). We will use X as shorthand for the entire solution system of x vectors (that is, X = (x(1, 1), . . . , x(S, τ))). </p><p>If we were prescient enough to know which scenario would be the realization (call it s) and therefore the values of the random variables, we would want to minimize </p><p>f(s; X(s)) = Σ_{t ∈ T} Σ_{i ∈ N(t)} [c_i(s, t) x_i(s, t)]    (P_s) </p><p>subject to </p><p>A(s)X(s) ≥ b(s)    (1) </p><p>x_i(s, t) ∈ {0, 1},  i ∈ I(t), t ∈ T    (2) </p><p>x_i(s, t) ≥ 0,  i ∈ N(t) − I(t), t ∈ T,    (3) </p><p>where I(t) defines the set of integer variables in each time stage. Discounting could easily be added. 
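</p><p>To make the single-scenario problem (P_s) concrete, the following is a minimal sketch that solves a tiny instance by brute-force enumeration over the binary variables. All data values and the helper name solve_brute_force are invented for illustration; a real instance would go to a mixed-integer solver or, as in this article, to tabu search. </p>

```python
from itertools import product

# A tiny single-scenario instance of (P_s): two binary variables and one
# covering constraint.  All numbers here are made up for illustration.
c = [3.0, 2.0]      # objective coefficients c_i(s, t), flattened across stages
A = [[1.0, 1.0]]    # constraint rows of A(s)
b = [1.0]           # right-hand sides; the constraints read A x >= b

def solve_brute_force(c, A, b):
    """Enumerate every 0/1 assignment and keep the cheapest feasible one."""
    best_x, best_val = None, float("inf")
    for x in product((0, 1), repeat=len(c)):
        feasible = all(
            sum(row[i] * x[i] for i in range(len(x))) >= rhs
            for row, rhs in zip(A, b)
        )
        val = sum(ci * xi for ci, xi in zip(c, x))
        if feasible and val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

x_star, f_star = solve_brute_force(c, A, b)   # -> x = (0, 1) with cost 2.0
```

<p>Enumeration is of course exponential in the number of binary variables, which is precisely why the article replaces it with a heuristic. 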
The notation AX is used to capture the usual sorts of single-period and period-linking constraints that one typically finds in multistage linear programming formulations. </p><p>Since we are not prescient, we must require solutions that do not require foreknowledge and that will be feasible no matter which scenario is realized. We refer to solution systems that satisfy constraints with probability one as admissible. We refer to a system of solution vectors as implementable if, for scenario pairs s and s' that are indistinguishable up to time t, it is true that x_i(s, t') = x_i(s', t') for all 1 ≤ t' ≤ t and each i in each N(t). We refer to the set of all such solution systems as N_S for a given set of scenarios S. The full problem is then </p><p>min Σ_{s ∈ S} [Pr(s) f(s; X(s))]    (P) </p><p>subject to </p><p>A(s)X(s) ≥ b(s),  s ∈ S    (4) </p><p>x_i(s, t) ∈ {0, 1},  i ∈ I(t), t ∈ T, s ∈ S    (5) </p><p>x_i(s, t) ≥ 0,  i ∈ N(t) − I(t), t ∈ T, s ∈ S    (6) </p><p>X ∈ N_S.    (7) </p><p>Unless time travel becomes possible, only solutions that are implementable are useful. Solutions that are not admissible, on the other hand, may have some value. Although some constraints may represent laws of physics, others may be violated slightly without serious consequence. The progressive hedging algorithm (PH) ensures implementable solutions at all iterations and admissibility on convergence. </p><p>Although ours is the first reported use of PH for multistage integer problems, there have been some applications to other problems. Mulvey and Vladimirou have reported success solving network problems (see, e.g., Mulvey and Vladimirou, 1991, 1992). Helgason and Wallace (1991) (as well as Wallace and Helgason, 1991) have reported success solving fishery problems and have suggested the use of tree-based data structures for managing PH data. </p><p>Approaches other than PH are possible for some types of integer problems. 
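</p><p>The implementability condition just defined can be checked mechanically. The sketch below is our own illustration (the data layout and the name is_implementable are invented): for each time t we assume a list of scenario groups that are indistinguishable up to t, and we require equal decisions within each group at every stage t' ≤ t. </p>

```python
def is_implementable(x, groups_by_time):
    """x[s][t] holds scenario s's decision vector at stage t (0-indexed).
    groups_by_time[t] lists the sets of scenario indexes that are
    indistinguishable up to stage t.  Implementability demands that all
    scenarios in a common group agree on every stage t' <= t."""
    for t, groups in enumerate(groups_by_time):
        for group in groups:
            ref = next(iter(group))
            for s in group:
                for tp in range(t + 1):
                    if x[s][tp] != x[ref][tp]:
                        return False
    return True

# Three scenarios; scenarios 0 and 1 are indistinguishable at the first stage.
groups = [[{0, 1}, {2}], [{0}, {1}, {2}]]
x_ok = {0: [[1], [0]], 1: [[1], [1]], 2: [[0], [1]]}   # agrees where required
x_bad = {0: [[1], [0]], 1: [[0], [1]], 2: [[0], [1]]}  # 0 and 1 differ at stage 0
```

<p>Here x_ok is implementable and x_bad is not, since scenarios 0 and 1 disagree at a stage where they are still indistinguishable. 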
A straightforward solution method is based on a reformulation called the deterministic equivalent (DE). The DE is produced by reformulating the original problem so that dependence on scenarios is eliminated implicitly. This will be defined in a rigorous fashion after more notation has been developed, but for now we note that for mixed-integer problems the resulting instance will typically be too large to solve exactly. Additional approaches have been discussed for some classes of stochastic programming problems with integer variables in the first stage (see, e.g., Averbakh, 1990) and for classes of two-stage problems with integers (see Laporte and Louveaux, 1993). </p><p>However, we are interested in the more general case where there can be any number of integer variables in any and all stages and where any of the data can be stochastic. We also are interested in methods that can accommodate more than two stages and instances that are potentially too large for exact methods, even for one scenario. In order to have all of these features, we must abandon exact methods and make use of heuristics. </p><p>In Section 1 we describe progressive hedging and the tabu search algorithm used to solve subproblems. Computational experiments are discussed in Section 2. The final section offers conclusions and directions for further research. </p><p>1. Solution methods </p><p>1.1. Progressive hedging </p><p>The progressive hedging algorithm proposed by Rockafellar and Wets (1991) is intuitively appealing and has desirable theoretical properties: it converges to a global optimum in the convex case, there is a linear convergence rate in the case of a linear stochastic problem, and if it converges in the nonconvex case (and if the subproblems are solved to local optimality), then it converges to a locally optimal solution. We describe our application of PH here. For a more general description see Rockafellar and Wets (1991). 
</p><p>To begin the PH implementation for (P), we organize the scenarios and decision times into a tree. The leaves correspond to scenario realizations. The leaves are grouped for connection to nodes at time τ. Each leaf is connected to exactly one time-τ node, and each of these nodes represents a unique realization up to time τ. The time-τ nodes are connected to time τ − 1 nodes so that each scenario connected to the same node at time τ − 1 has the same realization up to time τ − 1. This is continued back to time 1 (that is, "now"). Hence, two scenarios whose leaves are both connected to the same node at time t have the same realization up to time t. Clearly, then, in order for a solution to be implementable it must be true that if two scenarios are connected to the same node at some time t, then the values of x_i(t') must be the same under both scenarios for all i and for t' ≤ t. </p><p>To illustrate the notation developed thus far, we consider a very small example with three decision epochs (so τ = 3 and T = {1, 2, 3}), two decisions per period, one of which is integer and the other of which is constrained to be nonnegative, and only three additional constraints. For any given scenario s the problem (P_s) is to minimize </p><p>c_1(s, 1)x_1(s, 1) + c_2(s, 1)x_2(s, 1) + c_1(s, 2)x_1(s, 2) + c_2(s, 2)x_2(s, 2) + c_1(s, 3)x_1(s, 3) + c_2(s, 3)x_2(s, 3), </p><p>subject to </p><p>a_11(s, t)x_1(s, t) + a_12(s, t)x_2(s, t) ≥ b_1(s, t),  t = 1, 2, 3, </p><p>x_1(s, t) ∈ {0, 1},  x_2(s, t) ≥ 0,  t = 1, 2, 3. </p><p>Suppose that all data for t = 1 are known and the data for t &gt; 1 will become available before we will have to commit to the decision variables for these times. Further suppose that the only data that are stochastic for t &gt; 1 are c_2 and a_12. Now suppose that at t = 2 we will have c_2 = 2 and a_12 = 3 with probability 0.6, and c_2 = 3, a_12 = 3.7 as the only other possibility. 
If the first case occurs, then we will know with certainty that for t = 3 the values of c_2 and a_12 will both be 6, but in the second case there is a 0.5 chance that they will be unchanged from the t = 2 values and a 0.5 chance that they will both be 6. This is seen much more easily with a scenario tree graph as shown in Figure 1. </p><p>Figure 1. Scenario tree for the small example problem. </p><p>This graph uses a circle for both nodes and leaves. For each circle, this graph shows the pair (c_2, a_12) along with the unconditional probability of realizing the scenario(s) for the circle. </p><p>The solution for any particular scenario may not be of any value to us. Ultimately, we want to be able to solve the problem of minimizing the expected objective function value subject to meeting the constraints for all of the scenarios and also subject to the constraint that the solution system be implementable. </p><p>The way we obtain such solutions is to use progressive hedging. For each scenario s, approximate solutions are obtained for the problem of minimizing, subject to the constraints, the deterministic f_s plus terms that penalize lack of implementability. These terms strongly resemble those found when the method of augmented Lagrangians is used (Bertsekas, 1982). They make use of a system of row vectors w that have the same dimension as the column vector system X, so we use the same shorthand notation. For example, w(s) means (w(s, 1), . . . , w(s, τ)) for the multiplier system. </p><p>In order to give a formal algorithm statement, we need to formalize some of the scenario tree concepts. We use Pr(A) to denote the sum of Pr(s) over all s for scenarios emanating from node A (that is, those s that are the leaves of the subtree having A as a root, also referred to as s ∈ A). We use t(A) to indicate the time index for node A (that is, node A corresponds to time t). 
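</p><p>The quantity Pr(A) can be accumulated directly from the leaf probabilities. A small sketch using the probabilities implied by the example (the function name and data layout are our own): </p>

```python
def node_probability(leaves_under_node, pr):
    """Pr(A): the sum of Pr(s) over all scenarios s emanating from node A."""
    return sum(pr[s] for s in leaves_under_node)

# Scenario probabilities implied by the example: the first branch has
# probability 0.6, and the 0.4 branch splits 50/50 into two scenarios.
pr = {0: 0.6, 1: 0.2, 2: 0.2}

root = node_probability({0, 1, 2}, pr)    # the t = 1 ("now") node: 1.0
branch = node_probability({1, 2}, pr)     # the c2 = 3 branch at t = 2: 0.4
```

<p>In a larger implementation these sums would be computed once per node by a bottom-up pass over the tree, as suggested by the tree-based data structures of Helgason and Wallace. 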
</p><p>Let the operator X ← HS(x'; x) mean "assign the result of a heuristic search optimization that attempts to find the argument x that minimizes the right side of the expression, with the search beginning at x'." This is described in detail in the next subsection. We use HS(·; x) when the PH algorithm does not suggest a starting solution for the search. </p><p>For typesetting convenience, we consider the constraints (1), (2), and (3) for (P_s) to be represented by the symbol Ω_s. We use X(t; A) on the left side of a statement to indicate assignment to the vector (x_1(s, t), . . . , x_{n(t)}(s, t)) for each s ∈ A. We refer to vectors at each iteration using a superscript; for example, w^(0)(s) is the multiplier vector for scenario s at PH iteration zero. The PH iteration counter is k. </p><p>If we defer briefly the discussion of termination criteria, a formal version of the algorithm can be stated as follows, taking r &gt; 0 as a parameter. </p><p>1. k ← 0. </p><p>2. For all scenario indexes s ∈ S, </p><p>X^(0)(s) ← HS(·; X(s)) { f(s; X(s)) : X(s) ∈ Ω_s }    (8) </p><p>and w^(0)(s) ← 0. </p><p>3. k ← k + 1. </p><p>4. For each node A in the scenario tree and for t = t(A), </p><p>X̄^(k)(t; A) ← Σ_{s ∈ A} Pr(s) X^(k−1)(t; s) / Pr(A). </p><p>5. For all scenario indexes s ∈ S, </p><p>w^(k)(s) ← w^(k−1)(s) + r (X^(k−1)(s) − X̄^(k−1)(s)) </p><p>and </p><p>X^(k)(s) ← HS(X^(k−1)(s); X(s)) { f(s; X(s)) + w^(k)(s)X(s) + (r/2) ‖X(s) − X̄^(k−1)(s)‖² : X(s) ∈ Ω_s }.    (9) </p><p>6. If the termination criteria are not met, then go to Step 3. </p><p>The termination criteria are based mainly on convergence, but we must also terminate based on time because nonconvergence is a possibility. Iterations are continued until k reaches some predetermined limit K or the algorithm has converged, which we take to mean X̄^(k)(s) is sufficiently close to X̄^(k−1)(s) for all s. 
One possible definition of "sufficiently close" is to require the distance (for example, Euclidean) to be less than some parameter. A much better choice is to consider only integer components of X̄ and require equality. This requires no metric o...</p></li></ul>
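The steps above can be sketched in code. The following is a deliberately tiny illustration, not the article's implementation: one binary variable, one stage, and two scenarios; the heuristic search HS is replaced by exact enumeration over {0, 1}, and the integer-convergence test reduces to exact agreement of the scenario copies. All data values and the function name progressive_hedging are invented.

```python
def progressive_hedging(costs, pr, r=1.0, max_iter=50):
    """Toy PH on one binary variable with scenario costs and probabilities.
    Scenario subproblem (step 5): minimize
        costs[s]*x + w[s]*x + (r/2)*(x - xbar)**2   over x in {0, 1}.
    Returns the common value once all scenario copies agree (integer
    convergence), or None if the iteration limit is hit."""
    scenarios = list(costs)
    # Step 2: solve each scenario alone; multipliers start at zero.
    x = {s: min((0, 1), key=lambda v: costs[s] * v) for s in scenarios}
    w = {s: 0.0 for s in scenarios}
    xbar = sum(pr[s] * x[s] for s in scenarios)        # step 4 average
    for _ in range(max_iter):
        for s in scenarios:                            # step 5: multiplier update
            w[s] += r * (x[s] - xbar)
        for s in scenarios:                            # step 5: penalized subproblem
            x[s] = min((0, 1), key=lambda v, s=s:
                       costs[s] * v + w[s] * v + (r / 2) * (v - xbar) ** 2)
        xbar = sum(pr[s] * x[s] for s in scenarios)    # step 4: new average
        if len(set(x.values())) == 1:                  # integer convergence
            return x[scenarios[0]]
    return None

# Scenario 0 prefers x = 1, scenario 1 prefers x = 0; hedging settles on x = 1
# because scenario 0 carries more probability weight.
sol = progressive_hedging({0: -3.0, 1: 1.0}, {0: 0.7, 1: 0.3})
```

In the article the subproblem in step 5 is the induced quadratic (0, 1) mixed-integer program handed to tabu search; exhaustive enumeration merely stands in for HS here.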

