
Multi-stage Stochastic Linear Programming: Scenarios Versus Events

C. Beltran-Royo∗ L. F. Escudero† R. E. Rodriguez-Ravines‡

17/05/2010

Abstract

To solve the multi-stage linear programming problem, one may use a deterministic or a stochastic approach. The drawbacks of the two techniques are well known: the deterministic approach is unrealistic under uncertainty and the stochastic approach suffers from scenario explosion. We introduce a new scheme whose objective is to overcome both drawbacks. The focus of this new scheme is on events instead of scenarios, and for this reason we call it Multi-stage Event Linear Programming (ELP). As we show in the theoretical results and in the preliminary computational experiments, the ELP approach represents a promising compromise between the stochastic and the deterministic approaches regarding the capacity to deal with uncertainty and computational tractability.

Keywords Multi-stage stochastic linear programming, white noise, time series, scenario tree, news vendor problem.

Mathematics Subject Classification (2000) 90C05, 90C15.

1 Introduction

To address the Multi-stage Linear Programming (MLP) problem under uncertainty, one can use different approaches such as stochastic programming [6], chance constraint approaches [24] and robust optimization [22, 20], among others. In this paper we concentrate on the stochastic programming approach, which considers all the possible futures or scenarios under a probabilistic framework. As in the literature, we will use the term Multi-stage Stochastic Linear Programming (MSLP) or, for short, SLP, to refer to the MLP problem solved by stochastic programming. The relevance of the SLP approach is well known in the decision community [16]. The difficulties in solving (multi-stage) SLP instances are also well known, even in the case of a finite number of possible future scenarios. A first difficulty corresponds to building a tractable scenario tree representative of the underlying stochastic sequence [12, 17, 14]. Another difficulty arises, for example, when one considers integer variables [26]. In this paper, we only consider continuous variables.

In the literature we find exact and approximate methods to solve the SLP model. Exact methods are mainly based on augmented Lagrangian relaxation [10, 25], Benders decomposition [11, 13], and interior point methods [4, 21, 31], among others. If the number of scenarios becomes too large, exact methods are impractical. In this case, either one solves the SLP model approximately or one solves an approximation to the SLP model. This is the case of schemes such as scenario reduction [15], scenario sampling [23, 7], scenario refinement [8], aggregation of constraints or stages [9], and SLP with decision rules [32, 19], among others. However, even an approximate solution of the SLP model by the Sample Average Approximation method requires an exponential number of sampled scenarios in order to attain a reasonable accuracy [28].

∗ Corresponding author, [email protected], Statistics and Operations Research, Rey Juan Carlos University, Madrid, Spain.
† [email protected], Statistics and Operations Research, Rey Juan Carlos University, Madrid, Spain.
‡ [email protected], Bayes-Forecast, Madrid, Spain.

In this context it is useful to have a cost bound in order to assess the quality of approximate solutions. Bounds based on the Jensen and Edmundson-Madansky inequalities can be found in [6]. Probably the most popular bounds in stochastic linear programming are obtained by constraint or row aggregation [5, 34, 18]. Of course, other approaches can be used, for example inferring statistical bounds as in [27].

The objective of this paper is to introduce a new approach that can give an effective lower bound based on constraint aggregation, together with a good solution for the MLP problem under uncertainty. The focus of this new approach is on events instead of scenarios, and for this reason we call it Multi-stage Event Linear Programming (ELP).

To deal with uncertainty, the ELP approach represents a compromise between two extremes: SLP and Multi-stage Deterministic Linear Programming (DLP). On the most realistic extreme we have the SLP approach, where we have an exponential number of scenarios in order to represent the uncertainty. On the most unrealistic extreme we have the DLP approach, where we have one single scenario (the average scenario). In the ELP approach, the potentially huge scenario tree of the SLP approach is replaced by a sequence of small two-stage trees, which we call the event wood. The uncertainty associated to each future stage is represented by means of one of these two-stage trees. Of course, the ELP representation of uncertainty is simpler than the SLP one. Thus, ELP falls between the SLP and DLP approaches regarding computational effort and capacity to represent the uncertainty. The question is: can we get a convenient balance of these two opposing objectives, computational tractability and accuracy, by using the ELP approach?

The remainder of the paper is organized as follows. Section 2 presents the stochastic control problem we wish to solve. In Section 3 we introduce a new concept: the independent white noise scenario tree. Section 4 introduces the ELP approach, intended to solve the stochastic control problem presented in Section 2 and based on the independent white noise scenario tree concept. In Section 5 we study a new SLP cost bound given by the ELP approach. The results of a preliminary computational experiment are reported in Section 6.

2 The stochastic control problem

Inspired by [32], to introduce the Multi-stage Event Linear Programming (ELP) approach, we find it useful to consider the following deterministic control problem with horizon $T$, which is a simple case of Multi-stage Linear Programming (MLP) that we name $P_{MLP}$:

$$
\begin{aligned}
\min_{u,s} \quad & z_{MLP} := \sum_{t \in \widetilde{\mathcal{T}}} c_t \cdot u_t + \sum_{t \in \bar{\mathcal{T}}} L_t(s_t) & (1)\\
\text{s.t.} \quad & s_1 = s^0, & (2)\\
& s_t = A_t s_{t-1} + B_t u_{t-1} + C_t, \quad t \in \bar{\mathcal{T}}, & (3)\\
& u \geq 0. & (4)
\end{aligned}
$$

In this problem:


• $u_t$ accounts for the decisions at the end of stage $t$.

• $s_t$ accounts for the system state prior to taking decision $u_t$.

• $\mathcal{T} = \{1, \ldots, T\}$, $\bar{\mathcal{T}} = \mathcal{T} \setminus \{1\}$ and $\widetilde{\mathcal{T}} = \mathcal{T} \setminus \{T\}$.

• $c_t \cdot u_t$ stands for the scalar product of the vectors $c_t$ and $u_t$.

• $L_t$ is the piecewise linear loss function

$$L_t(s_t) = c_t^- \cdot [s_t - S_t]^- + c_t^+ \cdot [s_t - S_t]^+,$$

with $c_t^- \geq 0$, $c_t^+ \geq 0$, $[x]^+ = \max\{0, x\}$, $[x]^- = \max\{0, -x\}$, and $S_t$ a target vector for stage $t$.

• $A_t$ and $B_t$ are matrices of appropriate dimensions and $C_t$ is a vector.

In real-life instances, some of the parameters of $P_{MLP}$ can be stochastic. We collect all the stochastic problem data in the vector $\xi_t$. For example, if all the problem data are random we will have

$$\xi_t := (c_t, c_t^-, c_t^+, A_t, B_t, C_t).$$

If the data involve matrices, their elements can be stacked columnwise to make a vector. We will use $\xi_t$ to denote a realization of the random vector $\boldsymbol{\xi}_t$.

Assumption 1 The stochastic data sequence $\{\xi_t\}_{t \in \mathcal{T}}$ is independent of the stochastic decision sequence $\{u_t\}_{t \in \widetilde{\mathcal{T}}}$.

We assume throughout this paper that the decision at stage $t$ is followed by the unfolding of uncertainty at stage $t+1$. We have the sequence:

[decision]1 → [chance move]2 → [decision]2 → . . .

We also assume that $\xi_1$ is deterministic. The sequential structure of the problem implies that the decision at stage $t$ must be contingent on the history $\xi_{[t]} = (\xi_1, \ldots, \xi_t)$. This implies that $u_t$ is a function of $\xi_{[t]}$, and we will write $u_t(\xi_{[t]})$. Considering equation (3), the state vector $s_t$ must also be made contingent on $\xi_{[t]}$.

In order to introduce the ELP approach, we consider the simplest stochastic version of $P_{MLP}$, where $C_t$ is the only random data ($\xi_t = C_t$):

$$
\begin{aligned}
\min_{u,s} \quad & z_{MLP} := E\Big[ \sum_{t \in \widetilde{\mathcal{T}}} c_t \cdot u_t(\xi_{[t]}) + \sum_{t \in \bar{\mathcal{T}}} L_t(s_t(\xi_{[t]})) \Big] & (5)\\
\text{s.t.} \quad & s_1 = s^0, & (6)\\
& s_t(\xi_{[t]}) = A_t s_{t-1}(\xi_{[t-1]}) + B_t u_{t-1}(\xi_{[t-1]}) + C_t(\xi_{[t]}), \quad t \in \bar{\mathcal{T}}, & (7)\\
& u(\xi_{[t]}) \geq 0. & (8)
\end{aligned}
$$

3 Independent white noise scenario tree

The Multi-stage Event Linear Programming (ELP) approach is based on a new concept: the independent white noise scenario tree. To introduce this new concept, we recall the concepts of white noise and time series.


3.1 White noise

A one-dimensional white noise [30] is a sequence of random variables that are uncorrelated, have mean zero, and a finite variance (denoted $\sigma^2$). Formally, $\{\varepsilon_t\}$ is a one-dimensional white noise sequence if $E[\varepsilon_t] = 0$, $E[\varepsilon_t^2] = \sigma^2$, and $E[\varepsilon_t \varepsilon_{t'}] = 0$ for $t \neq t'$.

A multi-dimensional white noise, also denoted by $\{\varepsilon_t\}$, is a sequence of random vectors $\varepsilon_t = (\varepsilon_t^1, \ldots, \varepsilon_t^p)$ such that $E[\varepsilon_t] = 0$, $E[\varepsilon_t \varepsilon_t^\top] = \Sigma_\varepsilon$, and $E[\varepsilon_t \varepsilon_{t'}^\top] = 0$ for $t \neq t'$, where the matrix $\Sigma_\varepsilon$ is the common covariance matrix containing the covariances $\sigma_{ij} = \mathrm{cov}[\varepsilon_t^i, \varepsilon_t^j]$. Note that the variables $\varepsilon_t^i$ may be correlated over the component index $i$, but are uncorrelated over time.

A common, slightly stronger condition is that $\varepsilon_t$ and $\varepsilon_{t'}$ are independent from one another; this gives an independent white noise sequence. Often one assumes a normal distribution for $\varepsilon_t$; these are normally distributed or Gaussian white noise sequences.

In this paper we will only consider a discrete white noise. If the white noise is continuous, we will discretize it as in Section 6. This will allow us to approximate the $P_{MLP}$ by means of a scenario tree. Of course, the quality of the approximations will depend on the quality of the discretization of the continuous white noise.
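The defining white noise properties (zero mean, constant variance, no correlation across stages) can be checked empirically on a simple discrete white noise. The two-point distribution below is purely illustrative, not taken from the paper:

```python
import random

rng = random.Random(0)

def white_noise_sample(T):
    """One path of a discrete independent white noise: each eps_t is +1 or -1
    with probability 1/2, so E[eps_t] = 0, E[eps_t^2] = 1, and values at
    different stages are independent (hence uncorrelated)."""
    return [rng.choice([-1.0, 1.0]) for _ in range(T)]

def empirical_cov(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

# Draw many independent paths and check the properties at two fixed
# stages t=3 and t'=7 (0-based indices 2 and 6).
paths = [white_noise_sample(10) for _ in range(20000)]
e3 = [p[2] for p in paths]
e7 = [p[6] for p in paths]
print(abs(sum(e3) / len(e3)))      # close to 0 (zero mean)
print(abs(empirical_cov(e3, e7)))  # close to 0 (uncorrelated over time)
```

The same check applied to a time series built from this noise (e.g. an ARMA recursion) would show nonzero autocorrelation, which is exactly what distinguishes the data process from its underlying white noise.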

3.2 Time series

Usually, time series are modeled as functions of a white noise $\{\varepsilon_t\}$, as for example in the models called autoregressive, moving average, ARMA, ARIMA, etc. [30] (very often, the underlying white noise is assumed to be independent). To illustrate this idea we give the definition of the ARMA (autoregressive moving average) model. A time series $\{\xi_t\}_{t \in \mathbb{Z}}$ is said to be ARMA($p, q$) if $\xi_t$ is stationary and

$$\xi_t(\varepsilon_{[t]}) = \phi_1 \xi_{t-1} + \cdots + \phi_p \xi_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q},$$

with $\phi_p > 0$, $\theta_q > 0$.

Assumption 2

1. The random sequence $\{\xi_t\}$ in $P_{MLP}$ corresponds to a vector time series $\xi_t = (\xi_t^1, \ldots, \xi_t^p)$ which contains as its components $p$ univariate time series (demand, resources, etc.).

2. We assume that there is an underlying independent white noise $\{\varepsilon_\tau\}_{\tau \in \mathcal{T}}$ such that $\xi_t$ is a function of $\varepsilon_1, \ldots, \varepsilon_t$. We will write $\xi_t(\varepsilon_1, \ldots, \varepsilon_t)$ or, for short, $\xi_t(\varepsilon_{[t]})$, where $\varepsilon_{[t]} = (\varepsilon_1, \ldots, \varepsilon_t)$ is the history up to stage $t$. We also assume that $\varepsilon_1$ is deterministic and equal to 0.

3. For any $\varepsilon_t$ with $t \in \bar{\mathcal{T}}$, we assume a finite support $S_{\varepsilon_t} = \{\varepsilon_{tl}\}_{l \in \mathcal{L}_t}$, where $\mathcal{L}_t = \{1, \ldots, L_t\}$. The associated probability function is $f_{\varepsilon_t}(\varepsilon_{tl}) = P(\varepsilon_t = \varepsilon_{tl}) = \pi_{tl}$.

4. For any $\xi_t$ with $t \in \bar{\mathcal{T}}$, we assume a finite support $S_{\xi_t}$. The associated probability function is $f_{\xi_t}$.

One consequence of Assumption 2 is that $\varepsilon_{[t]}$ and $\xi_{[t]}$ will also have finite supports $S_{\varepsilon_{[t]}}$ and $S_{\xi_{[t]}}$, respectively.

3.3 Independent white noise scenario tree

In the case of a random sequence $\{\xi_t(\varepsilon_{[t]})\}$ with a finite number of possible realizations, one represents them by means of a scenario tree in order to proceed with numerical calculations [16, 29]. As we will see later, it will be useful to consider the scenario tree associated to the underlying independent white noise $\{\varepsilon_t\}$.


Figure 1: Node labels. The scenario tree nodes are indexed by a triplet label $tkl$ (stage $t$, group $k$ and leaf $l$).

Definition 1 Independent white noise scenario tree: the scenario tree associated to an independent white noise $\{\varepsilon_t\}$. In this particular scenario tree, we will assume that for all $t \in \mathcal{T}$, the number of successors (leaves) of each node of stage $t$ is constant and equal to $L_t$.

The scenario tree depicts all the possible realizations of the white noise $\{\varepsilon_t\}$. By construction of the scenario tree, each realization $\varepsilon_{[t]}$ of $\boldsymbol{\varepsilon}_{[t]}$ corresponds to a node of the scenario tree, say $n_{[t]}$, and vice versa. We say that $n_{[t]}$ is a realization of $\boldsymbol{n}_{[t]}$, the random node at stage $t$. Also, we identify $\boldsymbol{n}_{[t]}$ with $\boldsymbol{\varepsilon}_{[t]}$ and $n_{[t]}$ with $\varepsilon_{[t]}$. Therefore, we have

$$P\big(\boldsymbol{n}_{[t]} = n_{[t]}\big) = P\big(\boldsymbol{\varepsilon}_{[t]} = \varepsilon_{[t]}\big).$$

To introduce the Multi-stage Event Linear Programming (ELP) approach, we find it convenient to re-index the nodes $n_{[t]}$ of the scenario tree by $n_{tkl}$, where $t$ is the stage, $k$ is the group of events (nodes with the same immediate ancestor) and $l$ is the event. For example, in Fig. 1 we observe that at stage 2 we have only one group of events ($n_{111}$ is the common ancestor) and at stage 3 we have two groups of events. In general, at each stage $t \in \bar{\mathcal{T}}$ the groups of events are indexed by $k \in \mathcal{K}_t := \{1, \ldots, K_t\}$. Variables and parameters associated to each node are re-indexed accordingly.

At each node $n_{tkl}$ we have the following elements (see Fig. 2):

• The vector $x_{tkl} = (u_{tkl}, s_{tkl})$, which accounts for the decision and state vectors.

• $\xi_{tkl}$, the realization of the random vector $\xi_t$, which collects all the data parameters of the SLP model at stage $t$.

• $p_{tkl}$, the probability of reaching node $n_{tkl}$, that is,

$$p_{tkl} = P\big(\boldsymbol{n}_{[t]} = n_{tkl}\big).$$

For each stage $t \in \mathcal{T}$, we have $\sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl} = 1$.

• $q_{tkl}$, the conditional probability of reaching node $n_{tkl}$ from its ancestor node, denoted $n_{a(tkl)}$. That is,

$$q_{tkl} = P\big(\boldsymbol{n}_{[t]} = n_{tkl} \mid \boldsymbol{n}_{[t-1]} = n_{a(tkl)}\big).$$


Figure 2: Scenario tree. At stage $t = 1$, we have only one node, which accounts for the observed event $\xi_{111}$ and the current decision and state vector $x_{111}$. At each stage $t > 1$, each node $n_{tkl}$ accounts for its probability $p_{tkl}$, its conditional probability $q_{tkl}$, the possible event $\xi_{tkl}$, and the decision and state vector $x_{tkl}$.

At each pair $tk \in \bar{\mathcal{T}}\mathcal{K}_t$, we have that $\sum_{l \in \mathcal{L}_t} q_{tkl} = 1$.

Proposition 1 In an independent white noise scenario tree:

1. $q_{tkl} = \pi_{tl}$ for all $tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t$, where $\{\pi_{tl}\}$ is the probability distribution associated to $\varepsilon_t$.

2. $p_{tkl} = p_{a(tkl)} \pi_{tl}$ for all $tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t$.

Proof:

1. Taking into account that $\{\varepsilon_t\}$ is an independent white noise and assuming that $\varepsilon_{[t]} = (\varepsilon_{1,l(1)}, \ldots, \varepsilon_{t-1,l(t-1)}, \varepsilon_{tl})$ is the realization of $\boldsymbol{\varepsilon}_{[t]}$ that uniquely determines node $n_{tkl}$, we have:

$$
\begin{aligned}
q_{tkl} &= P\big(\boldsymbol{n}_{[t]} = n_{tkl} \mid \boldsymbol{n}_{[t-1]} = n_{a(tkl)}\big) \\
&= P\big(\boldsymbol{\varepsilon}_{[t]} = \varepsilon_{[t]} \mid \boldsymbol{\varepsilon}_{[t-1]} = \varepsilon_{[t-1]}\big) \\
&= P\big(\varepsilon_{1,l(1)}, \ldots, \varepsilon_{t-1,l(t-1)}, \varepsilon_{tl} \mid \varepsilon_{1,l(1)}, \ldots, \varepsilon_{t-1,l(t-1)}\big) \\
&= P(\varepsilon_t = \varepsilon_{tl}) = \pi_{tl}.
\end{aligned}
$$

2. $p_{tkl} = p_{a(tkl)} q_{tkl} = p_{a(tkl)} \pi_{tl}$.
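Proposition 1 can be checked on a tiny independent white noise scenario tree built by hand. The two-point supports per stage below are illustrative; any finite supports satisfying Assumption 2 would do:

```python
from itertools import product

# Discrete white noise per stage t: pairs (eps_{tl}, pi_{tl}).
support = {2: [(-1.0, 0.4), (1.0, 0.6)],
           3: [(-1.0, 0.5), (1.0, 0.5)]}

def stage_nodes(t):
    """Nodes at stage t are histories (l(2), ..., l(t)); by Proposition 1,
    p_tkl = p_a(tkl) * pi_tl, so a node's probability is the product of the
    pi's along its history."""
    idx = [range(len(support[s])) for s in range(2, t + 1)]
    nodes = {}
    for hist in product(*idx):
        p = 1.0
        for s, l in zip(range(2, t + 1), hist):
            p *= support[s][l][1]
        nodes[hist] = p
    return nodes

p2 = stage_nodes(2)
p3 = stage_nodes(3)
print(round(sum(p2.values()), 10))  # 1.0: stage probabilities sum to 1
print(round(sum(p3.values()), 10))  # 1.0
# Conditional probability of reaching node (0, 1) from its ancestor (0,):
q = p3[(0, 1)] / p2[(0,)]
print(q)  # equals pi_{3,1} = 0.5, independently of the ancestor
```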


Figure 3: Average scenario tree. At stage $t = 1$, we have one node, which accounts for the observed event $\xi_1$ and the current decision and state vector $x_1$. At each stage $t > 1$, we have only one node, which accounts for the expected event $\bar{\xi}_t$ and the decision and state vector $x_t$.

4 Three approaches for stochastic linear programming

So far we have assumed independence of the stochastic data sequence with respect to the decisions. Given that by Assumption 2 we assume finiteness of the support $S_{\xi_{[T]}}$, the stochastic control problem $P_{MLP}$ can be formulated as a stochastic linear programming problem. In this section we present three approaches to deal with this problem. The first two approaches are classical (the scenario tree approach and the average scenario tree approach) and the third one is new (the event wood approach).

4.1 The scenario tree approach

In the case of a finite support $S_{\xi_{[T]}}$, we can formulate the deterministic equivalent model of $P_{MLP}$ based on the scenario tree representation. This model corresponds to what we have called Multi-stage Stochastic Linear Programming (SLP). In this model, each label $tkl$ refers to node $n_{tkl}$ of the scenario tree.

Definition 2 SLP objective function and domain

$$
z_{SLP}(u, s) = \sum_{tkl \in \widetilde{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t} p_{tkl}\, c_t \cdot u_{tkl} + \sum_{tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t} p_{tkl}\, (c_t^- \cdot s_{tkl}^- + c_t^+ \cdot s_{tkl}^+), \qquad (9)
$$

$$
\begin{aligned}
D_{SLP} = \{\, (u, s) :\ & s_{111} = s^0, \\
& s_{tkl} = A_t s_{a(tkl)} + B_t u_{a(tkl)} + C_{tkl}, \quad tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t, & (10)\\
& s_{tkl}^+ - s_{tkl}^- = s_{tkl} - S_t, \quad tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t, & (11)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\}, & (12)
\end{aligned}
$$

where $n_{a(tkl)}$ is the ancestor of node $n_{tkl}$ and $\mathcal{T}\mathcal{K}_t\mathcal{L}_t$ stands for $\mathcal{T} \times \mathcal{K}_t \times \mathcal{L}_t$.

Definition 3 SLP problem

$$\min\ z_{SLP}(u, s) \quad \text{s.t.}\quad (u, s) \in D_{SLP}$$

4.2 The average scenario tree approach

In the Multi-stage Deterministic Linear Programming (DLP) approach one has a 'scenario tree' with a single scenario: the average scenario (see Fig. 3). In DLP the nodes are just labeled by the stage $t$. At each node $n_t$ we have:


Figure 4: Event wood. At each stage $t > 1$, we have a two-stage tree. Its root node accounts for $\bar{x}_{t-1}$, the average decision and state vector. Each leaf node $tl$ accounts for one possible event $\bar{\xi}_{tl}$, its probability $\pi_{tl}$ and the decision and state vector $x_{tl}$.

• The vector $x_t = (u_t, s_t)$, which accounts for the decision and state vectors.

• $\bar{C}_t$, the expected value of the random vector $C_t(\varepsilon_{[t]})$:

$$\bar{C}_t = E[C_t(\varepsilon_{[t]})] = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, C_{tkl}.$$

Definition 4 DLP objective function and domain

$$
z_{DLP}(u, s) = \sum_{t \in \widetilde{\mathcal{T}}} c_t \cdot u_t + \sum_{t \in \bar{\mathcal{T}}} (c_t^- \cdot s_t^- + c_t^+ \cdot s_t^+),
$$

$$
\begin{aligned}
D_{DLP} = \{\, (u, s) :\ & s_1 = s^0, \\
& s_t = A_t s_{t-1} + B_t u_{t-1} + \bar{C}_t, \quad t \in \bar{\mathcal{T}}, \\
& s_t^+ - s_t^- = s_t - S_t, \quad t \in \bar{\mathcal{T}}, \\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\}.
\end{aligned}
$$

Definition 5 DLP problem

$$\min\ z_{DLP}(u, s) \quad \text{s.t.}\quad (u, s) \in D_{DLP}$$

4.3 The event wood approach

As we said in the introduction, to deal with uncertainty, the ELP approach represents a compromise between the SLP and the DLP approaches. The potentially huge scenario tree of the SLP approach is replaced, in the ELP approach, by a sequence of two-stage trees, which we call the event wood (see Fig. 4). The uncertainty associated to each future stage is represented by means of one of these small trees. That is, we have a tree for each $t \in \bar{\mathcal{T}}$.

Each tree of the event wood has two types of nodes: one root and several leaves. In the tree associated to stage $t$, the root summarizes in one vector the relevant information from stage $t-1$. This vector is

$$\bar{x}_{t-1} := \sum_{l \in \mathcal{L}_{t-1}} \pi_{t-1,l}\, x_{t-1,l}$$

and accounts for the average decision and average state at stage $t-1$.

On the other hand, each leaf of the tree associated to stage $t$ corresponds to a possible event at stage $t$. We label these leaves by a pair $tl \in \bar{\mathcal{T}}\mathcal{L}_t$, where $t$ is the stage and $l$ is the event. The vectors and parameters associated to each leaf $tl$ are indexed accordingly. At each leaf $tl$ we have three elements:

• The vector $x_{tl} = (u_{tl}, s_{tl})$, which accounts for the decision and state vectors.

• $\pi_{tl} = P(\varepsilon_t = \varepsilon_{tl})$, the probability associated to leaf $tl$. For each stage $t$, we have $\sum_{l \in \mathcal{L}_t} \pi_{tl} = 1$.

• $\bar{\xi}_{tl}$, the conditional expectation of $\xi_t(\varepsilon_{[t]})$ given that $\varepsilon_t$ is equal to $\varepsilon_{tl}$:

$$
\begin{aligned}
\bar{\xi}_{tl} &= E\big[\xi_t(\varepsilon_{[t]}) \mid \varepsilon_t = \varepsilon_{tl}\big] = E\big[\xi_t(\varepsilon_{[t-1]}, \varepsilon_t) \mid \varepsilon_t = \varepsilon_{tl}\big] \\
&= \sum_{\varepsilon_{[t-1]} \in S_{\varepsilon_{[t-1]}}} \xi_t(\varepsilon_{[t-1]}, \varepsilon_{tl})\, \frac{f_{\varepsilon_{[t]}}(\varepsilon_{[t-1]}, \varepsilon_{tl})}{f_{\varepsilon_t}(\varepsilon_{tl})} \\
&= \sum_{k \in \mathcal{K}_t} \xi_{tkl}\, \frac{p_{tkl}}{\pi_{tl}} = \sum_{k \in \mathcal{K}_t} \xi_{tkl}\, \frac{p_{a(tkl)}\, \pi_{tl}}{\pi_{tl}} = \sum_{k \in \mathcal{K}_t} \xi_{tkl}\, p_{a(tkl)},
\end{aligned}
$$

where $f_{\varepsilon_{[t]}}$ and $f_{\varepsilon_t}$ are the joint and marginal distributions, respectively.
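The final identity $\bar{\xi}_{tl} = \sum_k \xi_{tkl}\, p_{a(tkl)}$, and the fact that the event averages preserve the stage expectation, can be verified numerically. All the data below (group probabilities, event probabilities, scenario values) are illustrative:

```python
# One stage t of the event wood: two ancestor groups k, two events l.
p_a = [0.4, 0.6]       # p_{a(tkl)}: probability of each group's ancestor node
pi = [0.5, 0.5]        # pi_{tl}: event probabilities
xi = [[10.0, 14.0],    # xi_{tkl}: scenario-tree data, indexed xi[k][l]
      [12.0, 18.0]]

# Event average: xi_bar_{tl} = sum_k p_{a(tkl)} * xi_{tkl}
xi_bar = [sum(p_a[k] * xi[k][l] for k in range(2)) for l in range(2)]
print(xi_bar)  # approximately [11.2, 16.4]

# Consistency check: the event wood preserves the stage expectation,
# sum_l pi_{tl} * xi_bar_{tl} = sum_{kl} p_{tkl} * xi_{tkl}, p_{tkl} = p_a * pi.
lhs = sum(pi[l] * xi_bar[l] for l in range(2))
rhs = sum(p_a[k] * pi[l] * xi[k][l] for k in range(2) for l in range(2))
print(lhs, rhs)  # both approximately 13.8
```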

We propose a new approximation to the $P_{MLP}$ based on the event wood. We will use the label ELP for this model (Multi-stage Event Linear Programming). As we already said, labels $tl$ and $t$ correspond to leaves and roots, respectively, of the event wood trees.

Definition 6 ELP objective function and domain

$$
z_{ELP}(u, s, \bar{u}, \bar{s}) = \sum_{tl \in \widetilde{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, c_t \cdot u_{tl} + \sum_{tl \in \bar{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, (c_t^- \cdot s_{tl}^- + c_t^+ \cdot s_{tl}^+),
$$

$$
\begin{aligned}
D_{ELP} = \{\, (u, s, \bar{u}, \bar{s}) :\ & s_1 = s^0, \\
& s_{tl} = A_t \bar{s}_{t-1} + B_t \bar{u}_{t-1} + \bar{C}_{tl}, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (13)\\
& s_{tl}^+ - s_{tl}^- = s_{tl} - S_t, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (14)\\
& \bar{u}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, u_{tl}, \quad t \in \widetilde{\mathcal{T}}, & (15)\\
& \bar{s}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl}, \quad t \in \widetilde{\mathcal{T}}, & (16)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\}, & (17)
\end{aligned}
$$

where $\bar{C}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, C_{tkl}$.

Definition 7 ELP problem

$$\min\ z_{ELP}(u, s, \bar{u}, \bar{s}) \quad \text{s.t.}\quad (u, s, \bar{u}, \bar{s}) \in D_{ELP}$$


5 Lower bounds

In the stochastic linear programming context, it is well known that, for the case of a stochastic right-hand side only, the deterministic model gives a lower bound for the stochastic optimal cost [5]. In the context of this paper this means

$$z^*_{DLP} \leq z^*_{SLP},$$

where $z^*_P$ stands for the optimal cost of a given problem $P$.
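This classical bound can be illustrated by brute force on a tiny two-stage news vendor instance of the kind used in Section 6. All numbers are illustrative, and a grid search stands in for an LP solver:

```python
# Two-stage newsvendor: order u >= 0 at unit cost c, then demand d is
# revealed; shortage cost c_minus, holding cost c_plus.
c, c_minus, c_plus = 1.0, 10.0, 0.1
demands = [(80.0, 0.5), (120.0, 0.5)]   # (demand scenario, probability)

def stage_loss(s):
    # piecewise linear loss with target 0: shortage if s < 0, holding if s > 0
    return c_minus * max(0.0, -s) + c_plus * max(0.0, s)

def cost(u, d):
    return c * u + stage_loss(u - d)

grid = [float(u) for u in range(0, 201)]
# SLP: minimize the expected cost over the demand scenarios.
z_slp = min(sum(p * cost(u, d) for d, p in demands) for u in grid)
# DLP: replace the random demand by its mean (here 100).
d_bar = sum(p * d for d, p in demands)
z_dlp = min(cost(u, d_bar) for u in grid)
print(z_dlp <= z_slp)  # True: the deterministic model is a lower bound
```

Here the deterministic optimum orders the mean demand and pays 100, while the stochastic optimum hedges by ordering 120 and pays 122; Jensen's inequality guarantees the direction of the gap.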

In this section we will prove that $z^*_{DLP} \leq z^*_{ELP} \leq z^*_{SLP}$.

With this objective in mind, we define the auxiliary problem:

Definition 8 Auxiliary objective function and domain. The auxiliary problem uses three groups of variables: the scenario-tree variables $u_{tkl}, s_{tkl}$, the event (leaf) variables $\bar{u}_{tl}, \bar{s}_{tl}$, and the stage averages $\bar{\bar{u}}_t, \bar{\bar{s}}_t$.

$$z_A(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) = z_{SLP}(u, s)$$

$$
\begin{aligned}
D_A = \{\, (u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) :\ & s_{111} = s^0, & (18)\\
& \bar{s}_{tl} = A_t \bar{\bar{s}}_{t-1} + B_t \bar{\bar{u}}_{t-1} + \bar{C}_{tl}, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (19)\\
& \bar{s}_{tl}^+ - \bar{s}_{tl}^- = \bar{s}_{tl} - S_t, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (20)\\
& \bar{s}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (21)\\
& \bar{s}_{tl}^+ = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^+, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (22)\\
& \bar{s}_{tl}^- = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^-, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (23)\\
& \bar{\bar{s}}_t = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, s_{tkl}, \quad t \in \widetilde{\mathcal{T}}, & (24)\\
& \bar{u}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, u_{tkl}, \quad tl \in \widetilde{\mathcal{T}}\mathcal{L}_t, & (25)\\
& \bar{\bar{u}}_t = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, u_{tkl}, \quad t \in \widetilde{\mathcal{T}}, & (26)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\} & (27)
\end{aligned}
$$

Definition 9 Auxiliary problem $P_A$

$$\min\ z_A(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \quad \text{s.t.}\quad (u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \in D_A$$

Lemma 1

1. If $\bar{\bar{s}}_t = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, s_{tkl}$ and $\bar{s}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}$, then $\bar{\bar{s}}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, \bar{s}_{tl}$.

2. If $\bar{\bar{u}}_t = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, u_{tkl}$ and $\bar{u}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, u_{tkl}$, then $\bar{\bar{u}}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, \bar{u}_{tl}$.

3. If $(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \in D_A$, then $(\bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \in D_{ELP}$.

4. If $(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \in D_A$, then

$$z_A(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) = z_{SLP}(u, s) = z_{ELP}(\bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}).$$

Proof: The first statement can be proved as follows:

$$
\bar{\bar{s}}_t = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, s_{tkl} = \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{a(tkl)}\, \pi_{tl}\, s_{tkl} = \sum_{l \in \mathcal{L}_t} \pi_{tl} \Big( \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl} \Big) = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, \bar{s}_{tl}.
$$

The second statement can be proved in the same way.

The third statement comes directly from the first and second statements.

In the fourth statement, the first equality holds by definition. To prove the second equality we use $p_{tkl} = p_{a(tkl)}\, \pi_{tl}$ to rearrange $z_{SLP}$:

$$
\begin{aligned}
z_{SLP}(u, s) &= \sum_{tkl \in \widetilde{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t} p_{tkl}\, c_t \cdot u_{tkl} + \sum_{tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t} p_{tkl}\, (c_t^- \cdot s_{tkl}^- + c_t^+ \cdot s_{tkl}^+) \\
&= \sum_{tl \in \widetilde{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, c_t \cdot \Big( \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, u_{tkl} \Big) + \sum_{tl \in \bar{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, c_t^- \cdot \Big( \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^- \Big) + \sum_{tl \in \bar{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, c_t^+ \cdot \Big( \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^+ \Big) \\
&= \sum_{tl \in \widetilde{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, c_t \cdot \bar{u}_{tl} + \sum_{tl \in \bar{\mathcal{T}}\mathcal{L}_t} \pi_{tl}\, (c_t^- \cdot \bar{s}_{tl}^- + c_t^+ \cdot \bar{s}_{tl}^+) \\
&= z_{ELP}(\bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}).
\end{aligned}
$$

Proposition 2 The optimal cost of the auxiliary problem $P_A$ gives a lower bound for the SLP optimal cost:

$$z^*_A \leq z^*_{SLP}.$$

Proof: Let us consider the SLP domain:

$$
\begin{aligned}
D_{SLP} = \{\, (u, s) :\ & s_{111} = s^0, \\
& s_{tkl} = A_t s_{a(tkl)} + B_t u_{a(tkl)} + C_{tkl}, \quad tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t, & (28)\\
& s_{tkl}^+ - s_{tkl}^- = s_{tkl} - S_t, \quad tkl \in \bar{\mathcal{T}}\mathcal{K}_t\mathcal{L}_t, & (29)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\} & (30)
\end{aligned}
$$


Step 1: In $D_{SLP}$, for each pair $tl \in \bar{\mathcal{T}}\mathcal{L}_t$, we aggregate the constraints (28) indexed by $tkl$ with $k \in \mathcal{K}_t$, using $p_{a(tkl)}$ as aggregation weights:

$$\sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)} \big( A_t s_{a(tkl)} + B_t u_{a(tkl)} + C_{tkl} \big).$$

Since the ancestors $a(tkl)$, $k \in \mathcal{K}_t$, range over all the nodes of stage $t-1$, this is equivalent to

$$\bar{s}_{tl} = A_t \bar{\bar{s}}_{t-1} + B_t \bar{\bar{u}}_{t-1} + \bar{C}_{tl},$$

where

$$
\begin{aligned}
\bar{s}_{tl} &= \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}, & (31)\\
\bar{\bar{s}}_t &= \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, s_{tkl}, & (32)\\
\bar{\bar{u}}_t &= \sum_{kl \in \mathcal{K}_t\mathcal{L}_t} p_{tkl}\, u_{tkl}, & (33)\\
\bar{C}_{tl} &= \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, C_{tkl}. & (34)
\end{aligned}
$$

Step 2: Repeat Step 1 for constraints (29):

$$\sum_{k \in \mathcal{K}_t} p_{a(tkl)} \big( s_{tkl}^+ - s_{tkl}^- \big) = \sum_{k \in \mathcal{K}_t} p_{a(tkl)} \big( s_{tkl} - S_t \big).$$

Considering that $\sum_{k \in \mathcal{K}_t} p_{a(tkl)} = 1$, we can write the equivalent equality

$$\bar{s}_{tl}^+ - \bar{s}_{tl}^- = \bar{s}_{tl} - S_t,$$

where

$$
\begin{aligned}
\bar{s}_{tl}^+ &= \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^+, & (35)\\
\bar{s}_{tl}^- &= \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s_{tkl}^-. & (36)
\end{aligned}
$$

Step 3: For each SLP point $(u, s)$ we can define the auxiliary point $(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}})$, where $\bar{s}, \bar{\bar{s}}, \bar{\bar{u}}, \bar{s}^+, \bar{s}^-$ are defined from $(u, s)$ by equations (31)-(36), respectively, and $\bar{u}_{tl}$ is defined as

$$\bar{u}_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, u_{tkl}.$$

By construction, if $(u, s) \in D_{SLP}$ then the associated auxiliary point $(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) \in D_A$. Furthermore, by definition,

$$z_A(u, s, \bar{u}, \bar{s}, \bar{\bar{u}}, \bar{\bar{s}}) = z_{SLP}(u, s).$$

This proves that $z^*_A \leq z^*_{SLP}$.

Proposition 3 Solving the auxiliary problem is equivalent to solving the ELP problem. Furthermore, they have the same optimal value:

$$z^*_A = z^*_{ELP}.$$


Proof: By Lemma 1.3, if $(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) \in D_A$ then $(\bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) \in D_{ELP}$. By Lemma 1.4,

$$z_{ELP}(\bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) = z_A(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0).$$

This implies that $z^*_{ELP} \leq z^*_A$.

Second, let us see that if $(\bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) \in D_{ELP}$ then there exists $(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) \in D_A$ such that

$$z_A(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) = z_{ELP}(\bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0).$$

Given that $\sum_{k \in \mathcal{K}_t} p_{a(tkl)} = 1$ and $p_{a(tkl)} \geq 0$, there must exist $\bar{k}$ such that $p_{a(t\bar{k}l)} > 0$. We define $u^0_{tkl}$ as follows:

$$u^0_{t\bar{k}l} = \frac{1}{p_{a(t\bar{k}l)}}\, \bar{u}^0_{tl},$$

and $u^0_{tkl} = 0$ for $k \neq \bar{k}$. From this definition $u^0 \geq 0$ and

$$\bar{u}^0_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, u^0_{tkl}.$$

In the same way, we define

$$s^0_{t\bar{k}l} = \frac{1}{p_{a(t\bar{k}l)}}\, \bar{s}^0_{tl},$$

and $s^0_{tkl} = 0$ for $k \neq \bar{k}$. From this definition $s^0 \geq 0$ and

$$\bar{s}^0_{tl} = \sum_{k \in \mathcal{K}_t} p_{a(tkl)}\, s^0_{tkl}.$$

Therefore, the point $(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0)$ thus defined is in $D_A$ and, by Lemma 1.4,

$$z_A(u^0, s^0, \bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0) = z_{ELP}(\bar{u}^0, \bar{s}^0, \bar{\bar{u}}^0, \bar{\bar{s}}^0).$$

This implies that $z^*_A \leq z^*_{ELP}$. All in all, $z^*_A = z^*_{ELP}$, and once we have an optimal solution for the auxiliary problem we also have an optimal solution for the ELP problem. In other words, solving $P_A$ is equivalent to solving $P_{ELP}$.

Theorem 1 The ELP optimal value gives a lower bound for the SLP optimal value:

$$z^*_{ELP} \leq z^*_{SLP}.$$

Proof: From Propositions 2 and 3 we have

$$z^*_{ELP} = z^*_A \leq z^*_{SLP}.$$

Theorem 2 The DLP optimal value gives a lower bound for the ELP optimal value:

$$z^*_{DLP} \leq z^*_{ELP}.$$


Proof: Let us consider the ELP domain:

$$
\begin{aligned}
D_{ELP} = \{\, (u, s, \bar{u}, \bar{s}) :\ & s_1 = s^0, \\
& s_{tl} = A_t \bar{s}_{t-1} + B_t \bar{u}_{t-1} + \bar{C}_{tl}, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (37)\\
& s_{tl}^+ - s_{tl}^- = s_{tl} - S_t, \quad tl \in \bar{\mathcal{T}}\mathcal{L}_t, & (38)\\
& \bar{u}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, u_{tl}, \quad t \in \widetilde{\mathcal{T}}, & (39)\\
& \bar{s}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl}, \quad t \in \widetilde{\mathcal{T}}, & (40)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\} & (41)
\end{aligned}
$$

Step 1: In $D_{ELP}$, aggregate the constraints (37), using $\pi_{tl}$ as aggregation weights:

$$\sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl} = \sum_{l \in \mathcal{L}_t} \pi_{tl} \big( A_t \bar{s}_{t-1} + B_t \bar{u}_{t-1} + \bar{C}_{tl} \big).$$

Considering that $\sum_{l \in \mathcal{L}_t} \pi_{tl} = 1$, we can write the equivalent equality

$$\bar{s}_t = A_t \bar{s}_{t-1} + B_t \bar{u}_{t-1} + \bar{C}_t.$$

Step 2: In $D_{ELP}$, aggregate the constraints (38), using $\pi_{tl}$ as aggregation weights:

$$\sum_{l \in \mathcal{L}_t} \pi_{tl} \big( s_{tl}^+ - s_{tl}^- \big) = \sum_{l \in \mathcal{L}_t} \pi_{tl} \big( s_{tl} - S_t \big).$$

Equivalently, $\bar{s}_t^+ - \bar{s}_t^- = \bar{s}_t - S_t$.

Step 3: We define the following auxiliary objective function and domain:

$$z_B(u, s, \bar{u}, \bar{s}) = z_{ELP}(u, s, \bar{u}, \bar{s}), \qquad (42)$$

$$
\begin{aligned}
D_B = \{\, (u, s, \bar{u}, \bar{s}) :\ & s_1 = s^0, \\
& \bar{s}_t = A_t \bar{s}_{t-1} + B_t \bar{u}_{t-1} + \bar{C}_t, \quad t \in \bar{\mathcal{T}}, & (43)\\
& \bar{s}_t^+ - \bar{s}_t^- = \bar{s}_t - S_t, \quad t \in \bar{\mathcal{T}}, & (44)\\
& \bar{s}_t^+ = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl}^+, \quad t \in \bar{\mathcal{T}}, & (45)\\
& \bar{s}_t^- = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl}^-, \quad t \in \bar{\mathcal{T}}, & (46)\\
& \bar{s}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, s_{tl}, \quad t \in \mathcal{T}, & (47)\\
& \bar{u}_t = \sum_{l \in \mathcal{L}_t} \pi_{tl}\, u_{tl}, \quad t \in \widetilde{\mathcal{T}}, & (48)\\
& u \geq 0,\ s^+ \geq 0,\ s^- \geq 0 \,\} & (49)
\end{aligned}
$$

We also consider the problem $P_B$:

$$\min\ z_B(u, s, \bar{u}, \bar{s}) \quad \text{s.t.}\quad (u, s, \bar{u}, \bar{s}) \in D_B$$


First, to prove that $z^*_B \leq z^*_{ELP}$ we can proceed in a similar way as in Proposition 2, where we proved $z^*_A \leq z^*_{SLP}$.

Second, to prove that $z^*_{DLP} = z^*_B$ we can proceed in a similar way as in Proposition 3, where we proved $z^*_{ELP} = z^*_A$.

6 Preliminary computational experience

So far we have introduced the event wood approach and have given some of its theoretical properties. Now we wish to show its potential usefulness for dealing with uncertainty whenever the SLP model becomes too large. Programs have been written in Matlab 7.0 and run on a PC (Pentium IV, 3.0 GHz, with 3 GB of RAM) under the Windows XP operating system. To solve the LPs associated to these models we have intensively used Cplex 9.1 (default settings) interfaced with Matlab [2].

6.1 The news vendor problem

For this preliminary computational experience we use a multi-stage version of the news vendor problem [29]. The DLP version of this problem is as follows:

$$
\begin{aligned}
\min_{s,u} \quad & \sum_{t \in \widetilde{\mathcal{T}}} c_t u_t + \sum_{t \in \bar{\mathcal{T}}} \big( c_t^- s_t^- + c_t^+ s_t^+ \big) \\
\text{s.t.} \quad & s_t = s_{t-1} + u_{t-1} - \xi_t, \quad t \in \bar{\mathcal{T}}, \\
& s_t^+ - s_t^- = s_t, \quad t \in \bar{\mathcal{T}}, \\
& s_1 = s_1^0, \\
& u_t \geq 0,\ s_t^+ \geq 0,\ s_t^- \geq 0, \quad t \in \mathcal{T}.
\end{aligned}
$$

In this model, $u_t$ is the order at stage $t$ and $\xi_t$ is the demand. The inventory level is $s_t$: a positive value means a physical inventory, and a negative one a shortage. The initial inventory is fixed at $s_1^0 \geq 0$. Unsatisfied demand is backlogged. The parameter $c_t^+$ is the unit holding cost, and $c_t^-$ the unit shortage cost. The purchase cost is $c_t$.

In the SLP version of this problem, stochastic demands $\xi_t$ are considered. Note that we will use $\boldsymbol{\xi}_t$ to denote the random variable and $\xi_t$ a realization of $\boldsymbol{\xi}_t$. As in [3], we assume that the demands form a conditionally heteroskedastic Gaussian sequence with unconditional mean $\mu_t$ and variance $\sigma_t^2$. That is, $\xi_t$ follows a normal distribution, $\xi_t \sim N(\mu_t, \sigma_t)$, for $t \in \mathcal{T}$.

The demand time series $\{\xi_t\}$ is based on the independent Gaussian white noise $\{\varepsilon_t\}$ with $\sigma_{\varepsilon_t} = 1$. At $t = 2$ we have

$$\xi_2 = \mu_2 + \sigma_2 \varepsilon_2 \qquad (50)$$

and for $t = 2, \ldots, T-1$ the conditional demands are

$$\xi_{t+1} \mid \xi_t = \mu_{t+1} + \rho_t \frac{\sigma_{t+1}}{\sigma_t} (\xi_t - \mu_t) + \sigma_{t+1} \sqrt{1 - \rho_t^2}\; \varepsilon_{t+1}. \qquad (51)$$

Therefore, $\xi_{t+1} \mid \xi_t$ follows a normal distribution,

$$\xi_{t+1} \mid \xi_t \sim N\big(\mu_{t+1 \mid \xi_t},\ \sigma_{t+1 \mid \xi_t}\big) \quad \text{for } t \in \widetilde{\mathcal{T}},$$


with the conditional mean

$$\mu_{t+1 \mid \xi_t} = E[\xi_{t+1} \mid \xi_t] = \mu_{t+1} + \rho_t \frac{\sigma_{t+1}}{\sigma_t} (\xi_t - \mu_t)$$

and the conditional variance

$$\sigma^2_{t+1 \mid \xi_t} = \mathrm{Var}[\xi_{t+1} \mid \xi_t] = \sigma^2_{t+1} (1 - \rho_t^2),$$

where $\rho_t$ is the correlation coefficient between the successive demands $\xi_t$ and $\xi_{t+1}$.
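The recursion (50)-(51) can be simulated directly; by construction the unconditional mean of $\xi_t$ stays at $\mu_t$ and the unconditional variance at $\sigma_t^2$. The parameter values below are illustrative (they mirror the ones used later in Test 1):

```python
import math
import random

T = 5
mu = [100 + 5 * (t - 1) for t in range(1, T + 1)]   # unconditional means
sigma = [10.0] * T                                  # unconditional std devs
rho = 0.5                                           # correlation between stages

rng = random.Random(1)

def demand_path():
    xi = [mu[0]]                                    # xi_1 deterministic (eps_1 = 0)
    xi.append(mu[1] + sigma[1] * rng.gauss(0, 1))   # eq. (50)
    for t in range(1, T - 1):                       # eq. (51), conditional update
        mean = mu[t + 1] + rho * (sigma[t + 1] / sigma[t]) * (xi[-1] - mu[t])
        xi.append(mean + sigma[t + 1] * math.sqrt(1 - rho ** 2) * rng.gauss(0, 1))
    return xi

# Check by simulation that the unconditional mean of xi_T is mu_T.
paths = [demand_path() for _ in range(20000)]
avg_T = sum(p[-1] for p in paths) / len(paths)
print(abs(avg_T - mu[-1]) < 1.0)  # True: empirical mean close to mu_T
```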

To solve the SLP model we have to construct the associated (independent white noise) scenario tree. Our construction of the scenario tree is based on a discretization $\{e_t\}$ of the (continuous) Gaussian white noise $\{\varepsilon_t\}$. We assume that $\{e_t\}$ inherits the independent and identically distributed property of $\{\varepsilon_t\}$.

First, we determine the support of $e_t$, which is the set $\{e_{tl} \mid l \in \mathcal{L}_t\}$ with $\mathcal{L}_t = \{1, \ldots, L_t\}$. Considering that $\varepsilon_t \sim N(0, 1)$, we partition the interval $[-4, +4]$ into $L_t$ contiguous intervals $I_{tl} = [\alpha_{tl}, \beta_{tl}]$ of length

$$\delta_t = \frac{8}{L_t}.$$

Each interval is represented by its midpoint $e_{tl}$, for all $l \in \mathcal{L}_t$. In the independent white noise scenario tree, the error associated to node $n_{tkl}$ is equal to $e_{tl}$ for all $k \in \mathcal{K}_t$.

Second, we compute the probability function of the discrete random variable $e_t$, which assigns probability $\pi_{tl}$ to $e_{tl}$ as follows:

$$\pi_{tl} = \Phi(\beta_{tl}) - \Phi(\alpha_{tl}),$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function.

Third, we complete the scenario tree by discretizing $\{\xi_t\}$. We assign to each node of the scenario tree the value $\xi_{tkl}$ by using the recursive formulas (50)-(51) and the error value $e_{tl}$. Furthermore, by Proposition 1 we know that $q_{tkl} = \pi_{tl}$ and that $p_{tkl} = p_{a(tkl)} \pi_{tl}$.
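The first two steps of this construction (interval partition, midpoints, and probabilities) can be sketched in a few lines, using `math.erf` to evaluate $\Phi$:

```python
import math

def discretize_standard_normal(L):
    """Partition [-4, 4] into L equal intervals; represent each interval
    by its midpoint e_l and give it probability pi_l = Phi(beta_l) - Phi(alpha_l)."""
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    delta = 8.0 / L          # interval length delta = 8 / L
    points, probs = [], []
    for l in range(L):
        a = -4.0 + l * delta
        b = a + delta
        points.append((a + b) / 2.0)
        probs.append(Phi(b) - Phi(a))
    return points, probs

pts, pr = discretize_standard_normal(10)
print(len(pts))                   # 10
print(round(sum(pr), 4))          # 0.9999: mass outside [-4, 4] is dropped
print(abs(sum(p * e for p, e in zip(pr, pts))) < 1e-6)  # True: mean ~ 0 by symmetry
```

Note that the probabilities sum to $\Phi(4) - \Phi(-4) \approx 0.99994$ rather than 1; one may renormalize them if an exact probability distribution is required.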

6.2 Test 1: capacity to deal with uncertainty

Now we wish to compare the capacity of the DLP and ELP approaches to deal with uncertainty against that of the SLP approach. For this comparison we will use three criteria:

• The (optimal) expected costs $z^*_{DLP}$ and $z^*_{ELP}$ as lower bounds of the best possible expected cost $z^*_{SLP}$.

• The (optimal) first stage decision $u^*_1$ given by the DLP model and by the ELP model, respectively. As suggested in [33], we compare them to an optimal first stage decision (which is given by the SLP model).

• Following [6], we also give the expected result of using the DLP solution, which we call $\tilde{z}_{DLP}$. To compute this expected result, we fix the first stage decision in the SLP model to the optimal first stage decision given by the DLP model, and then solve the resulting problem to optimality. Analogously, we compute $\tilde{z}_{ELP}$, the expected result of using the ELP solution.

The parameters that we have used are:

• T = 5.

• μ_t = 100 + 5(t − 1).

• a_t = 1 for t ∈ T.

• h_t = 0.1 for t ∈ T.

• s01 = 0.

• ρ_t = 0 or 0.5 for all t ∈ T.

• L_t = 10 for all t ∈ T.

In this model, the critical parameters are σ_t and s_t. High values of σ_t stress the uncertainty in the problem, with wider demand ranges. High values of the shortage cost s_t will produce high expected costs if the model is not well suited to cope with uncertainty. To see the sensitivity with respect to σ and s, we solve the news vendor problem for (σ, s) = (1, 1), (10, 10) and (20, 20), first for ρ = 0.5 and second for ρ = 0. The expected costs z∗, the expected result z of using the DLP or the ELP solution, and the first stage decisions u∗_1 are displayed in Table 1.

In Table 1 we observe that, in this test:

• The lower bound to the SLP optimal cost given by the ELP model is much tighter than the DLP bound.

• The expected result of using the DLP solution can be very bad in the case of high uncertainty and high shortage cost.

• The first stage decisions are the same for the SLP and ELP models. Therefore the expected result of using the ELP solution corresponds to the optimal SLP cost.
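Why a mean-based decision degrades under high shortage cost can be illustrated with a one-stage news vendor sketch. This is an illustrative analogue, not the paper's multi-stage model; the function name, cost structure (purchase a, holding h, shortage s per unit) and the Monte Carlo setup are our own assumptions:

```python
import random
from statistics import NormalDist

def expected_cost(u, mu=100.0, sigma=10.0, a=1.0, h=0.1, s=10.0, n=100000):
    """Monte Carlo estimate of the one-stage news vendor cost for an
    order quantity u under demand D ~ N(mu, sigma)."""
    rng = random.Random(0)
    total = 0.0
    for _ in range(n):
        d = rng.gauss(mu, sigma)
        total += a * u + h * max(u - d, 0.0) + s * max(d - u, 0.0)
    return total / n

mu, sigma, a, h, s = 100.0, 10.0, 1.0, 0.1, 10.0
u_mean = mu                                            # deterministic-style order: mean demand
u_frac = NormalDist(mu, sigma).inv_cdf((s - a) / (s + h))  # critical-fractile order
# With s >> a, ordering the mean leaves frequent shortages, so
# expected_cost(u_mean) exceeds expected_cost(u_frac).
```

The larger s and σ are, the larger this gap becomes, which mirrors the behavior of z_DLP versus z∗_SLP in Table 1.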

6.3 Test 2: Tractability

We briefly comment on the tractability of the three models. In particular, we are interested in the impact that the number of stages T has on each model. We study the dimensions of the constraint matrix A associated to the news vendor problem formulated as a standard LP. The determining parameters for these dimensions are the number of stages T and the number of sons per node L_t, with t ∈ T. In this test, we consider 10 sons per node in all the stages, i.e., L_t = L = 10. In Table 2, we report the dimensions m × n of A, where m is the number of constraints and n is the number of variables. In this table, we observe that these dimensions grow, as expected, exponentially, O(L^T), for the SLP model and linearly, O(T), for the DLP model. In the ELP model, these dimensions also grow linearly, but with order O(L × T).
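The growth orders above can be sketched as follows. The per-node counts (one constraint, three variables) and the ELP formulas are read off the pattern in Table 2 for L = 10; they are our assumptions, not formulas stated in the paper:

```python
def lp_sizes(T, L):
    """Dimensions (m, n) of the news vendor LP constraint matrix under
    each approach, plus scenario/event counts.
    Assumption: one constraint and three variables per tree node, and
    ELP formulas fitted to the Table 2 pattern."""
    nodes = (L**T - 1) // (L - 1)        # SLP: full scenario tree, O(L^T) nodes
    slp = (nodes, 3 * nodes)
    dlp = (T, 3 * T)                     # DLP: one node per stage, O(T)
    elp = ((L + 3) * (T - 1) + 1,        # ELP: O(L x T), inferred from Table 2
           3 * (L + 1) * (T - 1) + 3)
    return {"SLP": slp, "DLP": dlp, "ELP": elp,
            "scenarios": L ** (T - 1), "events": L * (T - 1)}
```

For T = 3 and L = 10 this reproduces the first row of Table 2: SLP 111 × 333 with 100 scenarios, DLP 3 × 9, ELP 27 × 69 with 20 events.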

We have also studied the CPU time required to solve the news vendor problem by the three approaches. We solved the instance defined by parameters (ρ, σ, s) = (0.5, 1, 1) in Table 1, for different numbers of stages (T = 3, . . . , 7). The reported CPU time accounts for the time needed to set up the model plus the time that CPLEX took to solve the LP instance. For the DLP and ELP formulations, this CPU time was under 1 second in all cases (denoted by ε in Table 2). The CPU time required for the SLP model was relatively much larger for T ≥ 5. Furthermore, for T = 7 we could not solve the SLP formulation within 50,000 seconds; for this case the required memory was 650 MB.

Finally, it is worth mentioning that the ELP model, by relaxing constraints (15)–(16), nicely decomposes into T − 1 two-stage stochastic LP problems.


Table 1: Influence of the shortage cost s and the standard deviation σ. In this table we give the expected cost z∗, the expected result z of using the DLP or the ELP solution, and the first stage decision u∗_1, for (σ, s) = (1, 1), (10, 10) and (20, 20), in the correlated (ρ = 0.5) and uncorrelated (ρ = 0) cases. We also give the variation with respect to column SLP (in %).

σ/s/ρ                 SLP        DLP        (%)       ELP       (%)
1 / 1 / 0.5     z∗    450.4545   450.0000    -0.10    450.4543  -0.00004
                z     450.4545   450.7408     0.06    450.4545   0
                u∗_1  106.2000   105.0000    -1.13    106.2000   0
10 / 10 / 0.5   z∗    473.2653   450.0000    -4.92    473.2653  -0.00000
                z     473.2653   513.0041     8.40    473.2653   0
                u∗_1  125.0000   105.0000   -16.00    125.0000   0
20 / 20 / 0.5   z∗    506.4057   450.0000   -11.14    506.4053  -0.00008
                z     506.4057   669.5606    32.22    506.4057   0
                u∗_1  161.0000   105.0000   -34.78    161.0000   0
1 / 1 / 0       z∗    450.5018   450.0000    -0.11    450.5016  -0.00004
                z     450.5018   450.7882     0.06    450.5018   0
                u∗_1  106.2000   105.0000    -1.13    106.2000   0
10 / 10 / 0     z∗    476.4509   450.0000    -5.55    476.4508  -0.00002
                z     476.4509   516.1896     8.34    476.4509   0
                u∗_1  125.0000   105.0000   -16.00    125.0000   0
20 / 20 / 0     z∗    514.2410   450.0000   -12.49    514.2409  -0.00002
                z     514.2410   677.3967    31.73    514.2410   0
                u∗_1  161.0000   105.0000   -34.78    161.0000   0

Table 2: Size of the LP constraint matrix (m × n), number of scenarios (events) and CPU time in seconds. The symbol ε means that the CPU time was under 1 second.

        SLP                                          DLP               ELP
T       m          n          Scenarios  CPU         m   n    CPU      m    n     Events  CPU
3       111        333        100        ε           3   9    ε        27   69    20      ε
4       1,111      3,333      1,000      ε           4   12   ε        40   102   30      ε
5       11,111     33,333     10,000     15          5   15   ε        53   135   40      ε
6       111,111    333,333    100,000    1760        6   18   ε        66   168   50      3
7       1,111,111  3,333,333  1,000,000  > 50,000    7   21   ε        79   201   60      30


7 Conclusions

The contribution of this paper has three aspects. First, we have introduced the concept of independent white noise scenario tree. This type of tree allows for a very convenient way to link time series theory with the scenario tree approach in multi-stage stochastic linear programming (SLP).

Second, we have introduced the multi-stage event linear programming (ELP) model, which is based on the so-called event wood instead of the scenario tree. In this new approach, the modeling of uncertainty is simpler than in the SLP approach. It is not intended to replace the SLP model, but to be used in the cases where the SLP model becomes intractable. Of course, in the ELP approach, the modeling of uncertainty is much richer than in the deterministic approach (DLP).

Third, we have proved that in a quite general class of multi-stage LP problems, with uncertainty only in the right-hand side, the ELP model gives a lower bound to the optimal SLP cost. We have also proved that this bound is tighter than the bound given by the DLP model. In the preliminary computational test, carried out with a multi-stage version of the news vendor problem, the ELP bounds have been very tight.

As future work, we plan to enhance the computational experience with the ELP approach, as well as to extend it to stochastic mixed integer LP problems by using the branch-and-fix coordination concept introduced in [1].

Acknowledgments We are thankful to Jean-Philippe Vial and Alain Haurie (University of Geneva, Switzerland) for their comments and support at Logilab. We also thank the support of the grant S2009/esp-1594 (Riesgos CM) from the Comunidad de Madrid (Spain) and the grants MTM2009-14039-C06-03 and MTM2009-14087-C04-01 from the Spanish Science and Innovation Ministry.

References

[1] A. Alonso-Ayuso, L.F. Escudero, and M.T. Ortuno. BFC, a branch-and-fix coordination algorithmic framework for solving some types of stochastic pure and mixed 0-1 programs. European Journal of Operational Research, 151:503–519, 2003.

[2] M. Baotic. Matlab interface for CPLEX, http://control.ee.ethz.ch/ hybrid/cplexint.php, 2004.

[3] D. Barnes-Schuster, Y. Bassok, and R. Anupindi. Coordination and flexibility in supply contracts with options. Manufacturing and Service Operations Management, 4(3):171–207, 2002.

[4] A. Berkelaar, J. A. S. Gromicho, R. Kouwenberg, and S. Zhang. A primal-dual decomposition algorithm for multistage stochastic convex programming. Mathematical Programming, 104:153–177, 2005.

[5] J. R. Birge. Aggregation bounds in stochastic linear programming. Mathematical Programming, 31:25–41, 1985.

[6] J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, 1997.

[7] J. Blomvall and A. Shapiro. Solving multistage asset investment problems by the sample average approximation method. Mathematical Programming, 108:571–595, 2006.

[8] M. S. Casey and S. Sen. The scenario generation algorithm for multistage stochastic linear programming. Mathematics of Operations Research, 30:615–631, 2005.

[9] M. Davidson. Primal-dual constraint aggregation with application to stochastic programming. Annals of Operations Research, 99:41–58, 2000.


[10] L.F. Escudero, J.L. de la Fuente, C. Garcia, and F.J. Prieto. A parallel computation approach for solving multistage stochastic network problems. Annals of Operations Research, 90:131–160, 1999.

[11] E. Fragniere, J. Gondzio, and J-Ph. Vial. Building and solving large-scale stochastic programs on an affordable distributed computing system. Annals of Operations Research, 99:167–187, 2000.

[12] K. Frauendorfer. Barycentric scenario trees in convex multistage stochastic programming. Mathematical Programming, 75:277–293, 1995.

[13] J. Gondzio, R. Sarkissian, and J.-Ph. Vial. Parallel implementation of a central decomposition method for solving large-scale planning problems. Computational Optimization and Applications, 19:5–29, 2001.

[14] H. Heitsch and W. Roemisch. Scenario tree modeling for multistage stochastic programs. Mathematical Programming, 118:371–406, 2009.

[15] H. Heitsch and W. Roemisch. Scenario tree reduction for multistage stochastic programs. Computational Management Science, 6(2):117–133, 2009.

[16] P. Kall and J. Mayer. Stochastic Linear Programming. Kluwer Academic Publishers, 2005.

[17] R. Kouwenberg. Scenario generation and stochastic programming models for asset liability management. European Journal of Operational Research, 134:279–292, 2001.

[18] D. Kuhn. Aggregation and discretization in multistage stochastic programming. Mathematical Programming, 113:61–94, 2008.

[19] D. Kuhn, P. Parpas, and B. Rustem. Bound-based decision rules in multistage stochastic programming. Kybernetika, 44(2):134–150, 2008.

[20] S. C. H. Leung, S. O. S. Tsang, W. L. Ng, and Y. Wu. A robust optimization model for multi-site production planning problem in an uncertain environment. European Journal of Operational Research, 181:224–238, 2007.

[21] X. Liu and J. Sun. A new decomposition technique in solving multistage stochastic linear programs by infeasible interior point methods. Journal of Global Optimization, 28:197–215, 2004.

[22] J. M. Mulvey, R. J. Vanderbei, and S.A. Zenios. Robust optimization of large-scale systems. Operations Research, 43:264–281, 1995.

[23] M. V. F. Pereira and L. M. V. G. Pinto. Multi-stage stochastic optimization applied to energy planning. Mathematical Programming, 52:359–375, 1991.

[24] A. Rong and R. Lahdelma. Fuzzy chance constrained linear programming model for optimizing the scrap charge in steel production. European Journal of Operational Research, 186:953–964, 2008.

[25] A. Ruszczynski and A. Shapiro. Some advances in decomposition methods for stochastic linear programming. Annals of Operations Research, 85:153–172, 1999.

[26] R. Schultz. Stochastic programming with integer variables. Mathematical Programming, 97:285–309, 2003.

[27] A. Shapiro. Inference of statistical bounds for multistage stochastic programming problems. Mathematical Methods of Operations Research, 58:57–68, 2003.


[28] A. Shapiro. On complexity of multistage stochastic programs. Operations Research Letters, 34:1–8, 2006.

[29] A. Shapiro, D. Dentcheva, and A. Ruszczynski. Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM, 2009.

[30] R. H. Shumway and D. S. Stoffer. Time Series Analysis and Its Applications. Springer, 2000.

[31] J. Sun and X. Liu. Scenario formulation of stochastic linear programs and the homogeneous self-dual interior-point method. INFORMS Journal on Computing, 18:444–454, 2006.

[32] J. Thenie and J.-Ph. Vial. Step decision rules for multistage stochastic programming: a heuristic approach. Automatica, 44:1569–1584, 2008.

[33] R. J.-B. Wets. Challenges in stochastic programming. Mathematical Programming, 75:115–135, 1996.

[34] S. E. Wright. Primal-dual aggregation and disaggregation for stochastic linear programming. Mathematics of Operations Research, 19(4):893–908, 1994.
