Models and Algorithms for Stochastic Programming
Jeff Linderoth
Dept. of Industrial and Systems Engineering, Univ. of Wisconsin-Madison
Enterprise-Wide Optimization Meeting, Carnegie-Mellon University
March 10th, 2009
Jeff Linderoth (UW-Madison) Models & Algs. for SP CMU-EWO 1 / 82
Mission Impossible
Explaining Stochastic Programming in 90 mins
I will try to give an overview. Please interrupt with questions!
What I’ll Ramble On
Models
How to deal with uncertainty
Why modeling uncertainty is important
Who has used stochastic programming?
Why more people don’t use stochastic programming
Algorithms
Extensive Form
Benders Decomposition (2-stage)
Sampling
Nested Benders Decomposition (multistage)
Dealing with Uncertainty Definition of Stochastic Programming
Etymology
program:
(3) An ordered list of events to take place or procedures to be followed; a schedule.
Late Latin programma, public notice, from Greek programma, programmat-, from prographein, to write publicly.
stochastic:
(1b) Involving chance or probability.
Greek stokhastikos, from stokhastēs, diviner, from stokhazesthai, to guess at, from stokhos, aim, goal.
Source: The American Heritage Dictionary of the English Language, Fourth Edition.
Dealing with Uncertainty Sources of Uncertainty
Sources of Uncertainty
Houston, we have uncertainty!
What we anticipate seldom occurs; what we least expected generally happens.
Benjamin Disraeli (1804 - 1881)
Financial: market price movements; defaults by a business partner
Operational: customer demands; travel times
Technology related: will a new technology be ready “in time”?
Market related: shifts in tastes
Competition: what will your competitors' strategy be next year?
Acts of God (Jeff’s travel experience yesterday!!!): weather; equipment failure; birds flying into planes
Stochastic Programming
A tool used in planning under uncertainty
More specifically: Mathematical Programming, or Optimization, in which some of the parameters defining a problem instance are random, or uncertain
Optimization
\[ \min_{x \in X} f(x) \]
x: Variables you control
Stochastic Optimization
\[ \min_{x \in X(\omega)} F(x, \omega) \]
ω: Variables you don’t control
Stochastic Optimization is UNDEFINED
You can’t possibly choose an x that optimizes for all ω
More specification is required
Jeff’s Stochastic Programming Assumptions
In stochastic programming, we assume that a probability distribution for the uncertainty ω is known or can be approximated.
We also assume that probabilities are independent of the decisions that are taken.
Decision-dependent uncertainty
Decisions influence probability distributions
Decisions influence knowledge discovery
Want to know about stochastic programming with decision-dependent uncertainty?
Talk to Ignacio!
Probability Theory(?)
This notion of having to know a probability distribution for the randomness is troubling, since in reality, very few people know exactly that
Their customer demands follow a log-normal distribution with mean 17.26 and variance 2.88726
Their plant will have forced shutdowns following a Weibull distribution with parameters (100.25, 73.7916)
Instead, you might be able to
Estimate distributions from historical data (be careful!)
Have “qualitative” probability measures (“low/medium/high”)
Create your own scenarios of interest
The Journey is the Reward?
Business process people can argue/discuss amongst themselves what the various scenarios might be and the outcomes of those scenarios.
This process by itself can be very useful
There is a good amount of frightening-looking mathematical theory and computational evidence that solutions obtained from stochastic programs are often quite “stable” with respect to changes in the input probability distribution
The Upshot
It doesn’t matter “too much” if your numbers aren’t quite right
The insights you gain from considering the uncertainty can still be valuable
Dealing with Uncertainty Related Decision Making Technologies
A Concrete Example: An Uncertain LP
\[
\begin{aligned}
\min\ & c x \\
\text{s.t.}\ & A x \ge b \\
& T(\omega) x \ge h(\omega) \\
& x \ge 0
\end{aligned}
\]
T(ω) and h(ω) are uncertain: $X(\omega) = \{x \mid Ax \ge b,\ T(\omega)x \ge h(\omega)\}$
We must choose x despite this uncertainty
Examples:
Decide production quantities before knowing demands
Constraint data includes imprecise measurements
Three Approaches
1 Robust optimization
2 Chance-constrained programming
3 Recourse-based stochastic programming
Robust Optimization
Uncertain data is assumed to lie in an uncertainty set
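\[ (T(\omega), h(\omega)) \in U \]
Guarantee that constraints are satisfied for all possible realizations
\[
\begin{aligned}
\min\ & c x \\
\text{s.t.}\ & A x \ge b \\
& T x \ge h \quad \forall (T, h) \in U \\
& x \ge 0
\end{aligned}
\]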
Tractability depends on structure of U
Robust Optimization
To control conservatism, the uncertainty set can be parameterized by a budget of uncertainty
Example 1: $T_{ij}(\omega) \in [l_{ij}, u_{ij}]$ (Bertsimas and Sim)
At most K of the components in each row can differ from the nominal value
Nature can choose which K will differ
K large ⇒ highly conservative (Soyster)
K = 0 ⇒ No robustness
Can formulate this problem as a linear program
Example 2: U is ellipsoidal (Ben-Tal and Nemirovski)
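As a rough illustration of the budget idea in Example 1, the sketch below evaluates the worst-case value of a single row $\sum_j T_{ij} x_j$ when nature may push at most K coefficients from their nominal value down to the lower end of their interval. It assumes x ≥ 0 (so the lower endpoint is always the damaging one for a ≥ constraint) and an integer K; the function name and the data are hypothetical, not taken from the talk.

```python
def worst_case_row(nominal, lower, x, K):
    """Smallest value of sum_j T_j * x_j when at most K coefficients drop
    from their nominal value to their lower bound (assumes x >= 0)."""
    base = sum(t * xj for t, xj in zip(nominal, x))
    # Loss caused by pushing coefficient j down to its lower bound.
    losses = [(t - l) * xj for t, l, xj in zip(nominal, lower, x)]
    # Nature picks the K most damaging coefficients.
    return base - sum(sorted(losses, reverse=True)[:K])

# Hypothetical row data.
nominal = [4.0, 3.0, 5.0]
lower = [3.0, 1.0, 4.5]
x = [1.0, 2.0, 2.0]

no_robustness = worst_case_row(nominal, lower, x, K=0)    # nominal row value
budget_one = worst_case_row(nominal, lower, x, K=1)       # one coefficient may move
soyster = worst_case_row(nominal, lower, x, K=len(x))     # every coefficient at its worst
```

A robust constraint would then require this worst-case value to be at least the right-hand side; larger K shrinks it and so makes the constraint harder to satisfy.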
Robust Optimization
Advantages:
Computationally tractable
Can yield extremely reliable solutions
Does not require stochastic model
Disadvantages:
Does not use a stochastic model
Although conservatism can be controlled, the control parameter doesn’t have meaning to decision makers
Stochastic Programming
Assume uncertain data are random variables with known distributions
Two approaches to uncertain constraints:
1 Require constraint to be satisfied with high probability
\[ \min \{ c x : x \in X,\ P\{T(\omega) x \ge h(\omega)\} \ge 1 - \epsilon \} \]
ε is a parameter, e.g. ε = 0.05 or ε = 0.01
Linear program with probabilistic (chance) constraints
2 Penalize violations of constraints
\[ \min \{ c x + E[\lambda (h(\omega) - T(\omega) x)^+] : x \in X \} \]
Special case of a two-stage stochastic program
Linear Programs with Probabilistic Constraints
Individual constraints:
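\[ \min \{ c x : x \in X,\ P\{T(\omega)_i x \ge h(\omega)_i\} \ge 1 - \epsilon_i \ \ \forall i \} \]
Joint constraints:
\[ \min \{ c x : x \in X,\ P\{T(\omega) x \ge h(\omega)\} \ge 1 - \epsilon \} \]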
Bad news: calculating probability is hard
Worse news: probabilistic constraints are generally non-convex!
Non-convexity of the feasible region
Consider: $P\{x_1 \ge \xi_1,\ x_2 \ge \xi_2\} \ge 0.6$
Each dot: a realization of ξ which occurs with probability 1/10
[Figure: scatter of the ten realizations of ξ in the (x1, x2) plane]
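The slide's scatter plot did not survive, so here is a small sketch of the same phenomenon on made-up data: two points that satisfy the chance constraint whose midpoint does not. The ten realizations below are hypothetical and chosen only to make the non-convexity visible.

```python
def prob_covered(x, realizations):
    """Fraction of equally likely realizations xi with x1 >= xi1 and x2 >= xi2."""
    hits = sum(1 for (xi1, xi2) in realizations if x[0] >= xi1 and x[1] >= xi2)
    return hits / len(realizations)

# Hypothetical data: ten realizations, each with probability 1/10.
realizations = [(1, 1), (1, 2), (2, 1),      # small in both coordinates
                (1, 8), (2, 9), (1, 9),      # large second coordinate
                (8, 1), (9, 2), (9, 1),      # large first coordinate
                (9, 9)]

a = (2, 9)
b = (9, 2)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

feasible_a = prob_covered(a, realizations) >= 0.6      # covers 6 of 10 points
feasible_b = prob_covered(b, realizations) >= 0.6      # covers 6 of 10 points
feasible_mid = prob_covered(mid, realizations) >= 0.6  # covers only 3 of 10
```

Both a and b are feasible but their midpoint is not, so the feasible region cannot be convex for this data.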
Two Stage Stochastic Programming
\[ \text{(SP)} \quad \min \{ c x + E[\lambda (h(\omega) - T(\omega) x)^+] : x \in X \} \]
Choose x ⇒ Observe (T(ω), h(ω)) ⇒ Pay penalty
Good news: (SP) is convex
Bad news: Calculating expectation is hard
Successful Approach: Sample Average Approximation
Generate $(T(\omega^1), h(\omega^1)), \dots, (T(\omega^N), h(\omega^N))$ and solve
\[ (\text{SP}_N) \quad \min \left\{ c x + \sum_{i=1}^{N} \frac{1}{N} \lambda (h(\omega^i) - T(\omega^i) x)^+ : x \in X \right\} \]
$x^*_N$ is often a good approximation to the true optimal solution
We’ll see (a lot) more later!
Stochastic Programming vs. Simulation
Simulation
(Pro): Very flexible; the system need not be mathematically defined
(Pro): Fast
(Con): If I run 100 “what-ifs” and get 100 different solutions, how does simulation help me plan for the future?
Stochastic Programming
(Con): More challenging to build and solve models
(Pro): SP helps you “optimize” over your “what-ifs”.
The Upshot!
Use simulation to generate scenarios. Input the scenarios to a stochastic program to decide how best to hedge against this uncertainty
Multistage Decision Making
[Figure: timeline alternating random events and decisions: ω1, x1, ω2, x2, ω3, ..., x_{T−1}, ω_T, x_T]
Random vectors $\omega_1 \in R^{n_1}, \omega_2 \in R^{n_2}, \dots, \omega_T \in R^{n_T}$
Make sequence of decisions $x_1 \in X_1, x_2 \in X_2, \dots, x_T \in X_T$.
The evolution of information is of fundamental importance to the decision-making process.
We make a decision now ($x_1$)
Nature makes a random decision $\omega_2$: (“stuff” happens)
We make a second period decision $x_2$ that attempts to repair the havoc wrought by nature (recourse).
Repeat as necessary...
We make decisions in stages, in between which uncertainty is revealed to us
Why use Stochastic Programming The Newsvendor
Hot Off the Presses
A paperboy (newsvendor) needs to decide how many papers to buy in order to maximize his profit.
He doesn’t know at the beginning of the day how many papers he can sell (his demand).
Each newspaper costs c.
He can sell each newspaper for a price of q.
He can return each unsold newspaper at the end of the day for r.
(Obviously r < c < q).
The Newsvendor Problem
Given only knowledge of the probability distribution F of demand, how many papers should the newsvendor buy?
Newsvendor Problem
Suppose that the newsvendor’s goal is to maximize the profits in the long run (in expectation)...
Intuitively, it seems that the newsvendor’s best strategy is to purchase the average demand every day
Take Away Message!
The “optimal” solution is NOT to use the mean demand.
In fact, the two solutions can be far apart (depending on the distribution and the parameters r, c, q).
Example—The Newsvendor
c = 50, q = 70, r = 5
Demand: (Truncated) Normally distributed. µ = 100, σ = 50
Mean Value Solution
Buy 100. (Duh!)
Expect to profit: 2000
TRUE long run profit ≈ 650
Stochastic Solution
Buy 75.
Expect to profit: 1500
TRUE long run profit ≈ 880
The difference between the two solutions (880 − 650) is called the value of the stochastic solution.
How much is it worth to you to plan using full uncertainty information as opposed to mean-values for the uncertain parameters?
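The comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: demand is Normal(100, 50) truncated at zero by rejection sampling, the stochastic solution comes from the standard critical-ratio condition F(x*) = (q − c)/(q − r), and long-run profits are estimated by Monte Carlo. The estimates therefore need not match the rounded figures on the slide.

```python
import random
from statistics import NormalDist

c, q, r = 50.0, 70.0, 5.0        # cost, selling price, return value
mu, sigma = 100.0, 50.0          # demand parameters before truncation at 0

def sample_demand(rng):
    """Draw from Normal(mu, sigma) truncated at zero by rejection (an assumption)."""
    while True:
        d = rng.gauss(mu, sigma)
        if d >= 0.0:
            return d

def profit(x, d):
    """Profit from buying x papers when demand turns out to be d."""
    sold = min(x, d)
    return q * sold + r * (x - sold) - c * x

def expected_profit(x, demands):
    return sum(profit(x, d) for d in demands) / len(demands)

# Critical-ratio solution: F(x*) = (q - c) / (q - r), with F the truncated-normal CDF.
normal = NormalDist(mu, sigma)
p0 = normal.cdf(0.0)                         # probability mass removed by truncation
ratio = (q - c) / (q - r)
x_stochastic = normal.inv_cdf(p0 + ratio * (1.0 - p0))
x_mean_value = mu                            # "buy the average demand", as on the slide

rng = random.Random(0)
demands = [sample_demand(rng) for _ in range(50_000)]
profit_mean_value = expected_profit(x_mean_value, demands)
profit_stochastic = expected_profit(x_stochastic, demands)
vss_estimate = profit_stochastic - profit_mean_value   # value of the stochastic solution
```

Both order quantities are evaluated on the same demand sample, so the difference between them is much less noisy than either estimate on its own.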
A Take Away Message
The “Flaw” of Averages
The flaw of averages occurs when uncertainties are replaced by single average numbers in planning.
Did you hear the one about the statistician who drowned fording a river with an average depth of three feet?
Point Estimates
If you are planning with point estimates for demands, then you are planning sub-optimally
It doesn’t matter how carefully you choose the point estimate: it is impossible to hedge against future uncertainty by considering one realization of the uncertainty in your planning process
Stochastic Programming Success Stories Financial Optimization
Russell-Yasuda Kasai
Yasuda Kasai: Seventh largest (worldwide) property and casualty insurer.
Assets of > ¥3.47 trillion
Liability structure is complex, but they want a tool that will allow them to maximize the revenue from these assets in the face of asset management restrictions
Frank Russell Company hired to develop Asset-Liability Management Model based on (multistage) stochastic programming
Carino, Myers, Ziemba: second place in the Edelman prize competition of INFORMS.
Asset Allocation Model
Decisions:
Investment amounts for various assets
Random Events:
Return on investment for each asset.
Liability payouts
Constraints:
Asset Allocation Constraints (Complex)
Loan Model
Liability Model
Compared to a performance benchmark established at Yasuda Kasai at the beginning of the Fiscal Year to measure the value added by their use of the model, the new model increased annual income by ¥9.5 billion.
Mr. Kunihiko Sasamoto, Director and Deputy President, Yasuda Kasai.
But Wait There’s More!
Ease of Use
Risk is well defined, not using some “abstract” measure like standard deviation
Improved other systems
Other models and IT systems “upgraded” to support new system
Improved Human Judgement
How to think about and incorporate uncertainty into the planning process
Product Portfolio Planning
Decisions:
Invest in various projects (All or nothing investment).
Complicated project prerequisite structure
Random Events: (HUGE impact)
Design-win from customers
Technology failures
Market forces
Constraints:
Resources
Hire-fire costs
Product Portfolio Management at Agere
We implemented a decision support tool for Agere
1 Optimization Model
2 Simulator of future conditions – (random events were correlated!)
The muckety-mucks loved it!
They like the ability to talk about the different scenarios.
Focuses discussion in business planning meetings
Gives “unbiased” simulator view of potential outcomes of decisions
Stochastic Programming Success Stories Logistics
SP in the Supply Chain
Decisions:
Regular supply chain decisions: How much? Where? And when?
Random Elements:
Demands, prices, resource capacity.
Supply chains going global imply that companies are now more exposed to risky factors such as exchange rates and reliability of transfer channels.
Constraints:
Regular supply chain constraints: Flow balance, material availability, etc.
A Case Study
T. Santoso, S. Ahmed, M. Goetschalckx, and A. Shapiro. “A Stochastic Programming Approach for Supply Chain Network Design under Uncertainty,” European Journal of Operational Research, vol. 167, pp. 96-115, 2005.
Two real supply chains
One domestic (cardboard packages to breweries and soft drink manufacturers...)
One global
Sizes: Around 100 facilities. Around 100 customers.
In general, the (sampled) stochastic model was roughly 5% better than using the “mean value” of demand, translating into millions of dollars in potential savings.
Supply Chain Projects
Bulk Gas Production and Distribution
Uncertainty in customer demands, “competitor drain”
Built (prototype) optimization model and simulator.
They are now(?) doing a real implementation
Lesson Learned
Having a (static) simulation of the production-distribution process is a key component to the project
Stochastic Programming Success Stories Other Industries
Other Industrial Applications of SP
Energy Industry
Unit Commitment Problem: Schedule production from power generation units
Telecommunication
Capacity/bandwidth planning: Invest in capacity for the network before you know the true bandwidth demands
Military
Network Interdiction Problem: Where to place “agents” on a network to “interrupt” evil-doers
It ain’t that rosy
As far as I know, most implementations are built on a case-by-case basis and are fairly ad-hoc.
Why More People Don’t Use SP
Stochastic Programming Objectives—Risk Profile
What is your goal?
1 I want to do well on average
Expected Value
2 I want to limit my exposure in the “worst” case or cases
Value at Risk/Conditional Value at Risk
3 I want the probability that I achieve a goal to be sufficiently high?
Chance constraints
4 I want to achieve a “steady” return?
Dispersion-based objectives
Each of these implies a different notion of risk, and leads to a different stochastic optimization problem
Stochastic Programming isn’t about getting a number, it’s about getting a distribution that looks good to you
Some SP Objectives
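$\min F(x, \bar{\omega})$: Mean-Value Problem (ω replaced by its mean $\bar{\omega}$)
$\min E_\omega F(x,\omega)$: Risk Neutral
$\min E_\omega F(x,\omega) - \lambda \rho(F(x,\omega))$: Risk Measures
$\rho(F(x,\omega)) = \mathrm{Var}\, F(x,\omega)$: Markowitz
$\rho(F(x,\omega)) = E[(E F(x,\omega) - F(x,\omega))^+]$: Semideviation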
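To make the two risk measures concrete, here is a minimal sketch that evaluates the expectation, the variance (Markowitz) and the semideviation, in the form written on the slide, for a discrete distribution of outcomes. The two outcome vectors are hypothetical and stand for F(x, ω) under two different decisions x.

```python
def expectation(values, probs):
    return sum(p * v for v, p in zip(values, probs))

def variance(values, probs):
    mean = expectation(values, probs)
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

def semideviation(values, probs):
    """E[(E F - F)^+], following the form written on the slide."""
    mean = expectation(values, probs)
    return sum(p * max(mean - v, 0.0) for v, p in zip(values, probs))

# Hypothetical outcomes F(x, omega) of two candidate decisions over four scenarios.
probs = [0.25, 0.25, 0.25, 0.25]
steady = [10.0, 10.0, 10.0, 10.0]
volatile = [0.0, 0.0, 20.0, 20.0]

summary = {
    name: (expectation(vals, probs), variance(vals, probs), semideviation(vals, probs))
    for name, vals in (("steady", steady), ("volatile", volatile))
}
```

The two decisions look identical to a risk-neutral objective (same expectation) and are separated only once a dispersion term enters the objective.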
Things People Want
Arbitrary Distributions
(Conditional) Value at Risk
Network Problems
Scenario Trees
Stochastic Dynamic Programming
Robust Optimization
(Joint) Chance Constraints
Stochastic Dominance
Stochastic Control
Joint Distributions
Nonlinear problems
Integer problems
Free Beer
Supporting Stochastic Programs
I point out all these different flavors of SP to highlight what I think has been one of the hindrances to having a modeling language for SP.
I don’t know the key to success, but the key to failure is trying to please everybody.
Bill Cosby (1937 - )
I believe the fact that a “stochastic program” is not a well-defined concept is one of the fundamental reasons why more people don’t use stochastic programming
Other reasons people don’t use stochastic programming?
Why Don’t More People Use Stochastic Programming
They don’t start their training early enough!
Jacob Linderoth, age 4 months, reading Introduction to Stochastic Programming
Why More People Don’t Use SP Barriers to Stochastic Programming
Why Don’t More People Use Stochastic Programming
Because they don’t know the probability distribution?
Even crude approximations can help
Because they can’t “solve” them?
Linderoth and Wright solve a 10-million scenario problem
Recent theory suggests that you don’t need to include many scenarios to get an accurate solution to the true problem
Because they can’t model them?
Modeling tools are on the way (more later)
Because it is hard to verify that the solution is better
The same could be said of Deterministic Optimization
Use simulation to verify that the solution is better
Probability Management
A “true believer” is Sam Savage (consulting professor at Stanford).
He believes companies should have a comprehensive probability management plan.
Probability Management
Simulations to generate distributions
Information systems to hold distributions of key uncertain inputs
A “Chief probability officer” responsible for signing off on the distributions
You can start small...
1 What are your scenarios and distributions?
2 Do you have models that can use this information?
Algorithms
ALGORITHMS
I focus almost exclusively on two-stage recourse problems
Algorithms Two-Stage Stochastic Programs with Recourse
Stochastic Programming
A Stochastic Program
\[ \min_{x \in X} E_\omega F(x, \omega) \]
2 Stage Stochastic LP w/Recourse
\[ F(x,\omega) \overset{\mathrm{def}}{=} c^T x + Q(x,\omega) \]
$c^T x$: Pay me now
$Q(x,\omega)$: Pay me later
The Recourse Problem
\[ Q(x,\omega) \overset{\mathrm{def}}{=} \min \{ q^T y : W y = h(\omega) - T(\omega) x,\ y \ge 0 \} \]
Expected Recourse Function:
\[ Q(x) \overset{\mathrm{def}}{=} E_\omega [Q(x,\omega)] \]
Two-Stage Stochastic LP
\[ \min_{x \ge 0,\ Ax = b} c^T x + Q(x) \]
Algorithms Extensive Form
Extensive Form
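Assume $\Omega = \{\omega_1, \omega_2, \dots, \omega_S\} \subseteq R^r$, $P(\omega = \omega_s) = p_s$, $\forall s = 1, 2, \dots, S$
$T_s \equiv T(\omega_s)$, $h_s \equiv h(\omega_s)$
Then we can write the extensive form:
\[
\begin{aligned}
\min\ & c^T x + p_1 q^T y_1 + p_2 q^T y_2 + \cdots + p_S q^T y_S \\
\text{s.t.}\ & A x = b \\
& T_1 x + W y_1 = h_1 \\
& T_2 x + W y_2 = h_2 \\
& \qquad \vdots \\
& T_S x + W y_S = h_S \\
& x \in X,\ y_1 \in Y,\ y_2 \in Y,\ \dots,\ y_S \in Y
\end{aligned}
\]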
The Upshot!
This is just a larger linear program
It is a larger linear program that also has special structure
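The special structure can be seen by assembling the extensive-form data explicitly. The sketch below stacks the first-stage rows and one block of rows per scenario into a single constraint matrix using plain Python lists; the helper names and the tiny instance are hypothetical, and no solver is called.

```python
def extensive_form(A, b, W, T_list, h_list):
    """Stack A x = b and T_s x + W y_s = h_s into one matrix and right-hand side."""
    n_x = len(A[0])
    n_y = len(W[0])
    S = len(T_list)
    rows, rhs = [], []
    # First-stage rows: A x = b, with zeros in every y_s block.
    for a_row, b_i in zip(A, b):
        rows.append(list(a_row) + [0.0] * (n_y * S))
        rhs.append(b_i)
    # One block of rows per scenario: T_s x + W y_s = h_s.
    for s, (T_s, h_s) in enumerate(zip(T_list, h_list)):
        for t_row, w_row, h_i in zip(T_s, W, h_s):
            row = list(t_row) + [0.0] * (n_y * S)
            start = n_x + s * n_y
            row[start:start + n_y] = w_row
            rows.append(row)
            rhs.append(h_i)
    return rows, rhs

def extensive_objective(c, q, probs):
    """Cost vector: c for x, followed by p_s * q for each y_s block."""
    return list(c) + [p * q_j for p in probs for q_j in q]

# Hypothetical tiny instance: 2 first-stage variables, 2 recourse variables, 3 scenarios.
A, b = [[1.0, 1.0]], [10.0]
W = [[1.0, -1.0]]
T_list = [[[1.0, 0.0]], [[1.0, 2.0]], [[0.0, 1.0]]]
h_list = [[4.0], [6.0], [8.0]]
rows, rhs = extensive_form(A, b, W, T_list, h_list)
cost = extensive_objective([1.0, 2.0], [3.0, 1.0], [0.5, 0.25, 0.25])
```

Each scenario block touches only the x columns and its own y_s columns, which is the block-angular structure that decomposition methods exploit.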
Best-Known Solution Procedure
METHOD
Small SP’s are Easy!
[Figure: solution time versus number of scenarios (0 to 300) for Cplex/Extensive Form and L-shaped]
Algorithms The LShaped Method
Two-Stage Stochastic Linear Programming
We assume that P has finite support, so ω has a finite number of possible realizations (scenarios):
\[ Q(x) = \sum_{i=1}^{N} p_i Q(x, \omega_i) \]
For a partition of the N scenarios into sets $N_1, N_2, \dots, N_t$, let $Q_{[j]}(x)$ be the contribution of the jth set to Q(x):
\[ Q_{[j]}(x) \overset{\mathrm{def}}{=} \sum_{i \in N_j} p_i Q(x, \omega_i) \]
so then $Q(x) = \sum_{j=1}^{t} Q_{[j]}(x)$
Important (and well-known) Facts
$Q(x,\omega_i)$, $Q_{[\cdot]}(x)$, and $Q(x)$ are piecewise linear convex functions of x.
If $\pi_i$ is an optimal dual solution to the linear program corresponding to $Q(x,\omega_i)$, then $-T_i^T \pi_i \in \partial Q(x,\omega_i)$
\[ g_j(x) \overset{\mathrm{def}}{=} \sum_{i \in N_j} -p_i T_i^T \pi_i \in \partial Q_{[j]}(x). \]
Key Idea
Represent $Q_{[j]}(x)$ by an artificial variable $\theta_j$ and find supporting planes for $\theta_j$
\[ \theta_j \ge Q_{[j]}(x^k) + g_j(x^k)^T (x - x^k) \qquad (*) \]
Point of Decomposition
Evaluation of Q(x) is separable
We can solve linear programs corresponding to each $Q(x,\omega_i)$ independently, in parallel!
Worth 1000 Words?
[Figure: Q(x) plotted against x, with supporting planes added at successive iterates x^1, x^2, ..., x^k]
(Multicut) L-shaped method
[Figure: master problem M connected to subproblems s1, s2, s3, s4, s5]
1 Solve the master problem M with the current approximation to Q(x) for $x^k$.
2 Solve the subproblems ($s_j$), evaluating $Q(x^k)$ and obtaining subgradient(s) to update the master approximation M
3 k = k+1. Goto 1.
Let’s Get Parallel!
Of course, solution of sj can be carried out independently.
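The loop above can be sketched on a one-dimensional toy problem. The code below is a single-cut variant (one cluster containing all scenarios, rather than the multicut version), applied to a newsvendor with discrete demand so that each "subproblem" has a closed form and the master can be solved by enumerating cut intersections. All names and data are illustrative assumptions, not the solver described in the talk.

```python
def recourse(x, d, q, r):
    """Second-stage cost Q(x, d) (negative revenue) and a subgradient in x."""
    if x <= d:
        return -q * x, -q
    return -q * d - r * (x - d), -r

def solve_master(c, cuts, lo, hi):
    """Minimize c*x + theta s.t. theta >= a + g*x for all cuts, lo <= x <= hi.
    In one dimension the optimum lies at a bound or where two cuts cross."""
    candidates = {lo, hi}
    for i, (a1, g1) in enumerate(cuts):
        for a2, g2 in cuts[i + 1:]:
            if g1 != g2:
                x = (a2 - a1) / (g1 - g2)
                if lo <= x <= hi:
                    candidates.add(x)
    def model(x):
        return c * x + max(a + g * x for a, g in cuts)
    best = min(candidates, key=model)
    return best, model(best)

def l_shaped(c, q, r, demands, probs, lo, hi, tol=1e-6, max_iter=100):
    x, cuts = lo, []
    best_x, upper = x, float("inf")
    for _ in range(max_iter):
        # Solve every scenario subproblem at the current x (independent of each other).
        results = [recourse(x, d, q, r) for d in demands]
        Q = sum(p * v for p, (v, _) in zip(probs, results))
        g = sum(p * s for p, (_, s) in zip(probs, results))
        if c * x + Q < upper:
            best_x, upper = x, c * x + Q
        cuts.append((Q - g * x, g))          # theta >= Q(x_k) + g * (x - x_k)
        # Re-solve the master with the new cut to get the next x and a lower bound.
        x, lower = solve_master(c, cuts, lo, hi)
        if upper - lower <= tol:
            break
    return best_x, upper

demands, probs = [50.0, 100.0, 150.0], [0.3, 0.4, 0.3]
x_opt, value = l_shaped(50.0, 70.0, 5.0, demands, probs, lo=0.0, hi=200.0)
```

The list comprehension over scenarios is the part that parallelizes: each call to `recourse` needs only the current x and its own scenario data.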
Warning!
If Q(x) is not convex, then this algorithm doesn’t work
If you have integer recourse variables $y \in Z^p \times R^{n-p}$, the problem becomes significantly more difficult.
Your Options
Give your favorite solver the full extensive form (and pray)
Weak relaxation
Decomposition method: Carøe and Schultz, Sen
Spatial branch and bound:
Want to know about stochastic integer programming/spatial branch and bound?
Talk to Nick!
Does it Work? The World’s Largest LP
Linderoth and Wright built a fancy decomposition-based solver capable of running on “the grid”
Storm: a stochastic cargo-flight scheduling problem (Mulvey and Ruszczynski)
We aim to solve an instance with 10,000,000 scenarios
$x \in R^{121}$, $y_k \in R^{1259}$
The deterministic equivalent LP is of size
$A \in R^{985,032,889 \times 12,590,000,121}$
The Super Storm Computer
Number  Type           Location
184     Intel/Linux    Argonne
254     Intel/Linux    New Mexico
36      Intel/Linux    NCSA
265     Intel/Linux    Wisconsin
88      Intel/Solaris  Wisconsin
239     Sun/Solaris    Wisconsin
124     Intel/Linux    Georgia Tech
90      Intel/Solaris  Georgia Tech
13      Sun/Solaris    Georgia Tech
9       Intel/Linux    Columbia U.
10      Sun/Solaris    Columbia U.
33      Intel/Linux    Italy (INFN)
1345    (total)
TA-DA!!!!!
Wall clock time                            31:53:37
CPU time                                   1.03 Years
Avg. # machines                            433
Max # machines                             556
Parallel Efficiency                        67%
Master iterations                          199
CPU Time solving the master problem        1:54:37
Maximum number of rows in master problem   39647
Number of Workers
[Figure: number of workers (0 to 600) versus elapsed time in seconds (0 to 140000)]
Sampling
Why Sampling is Necessary
$y_s \equiv y(\omega_s)$ is the recourse action to take if scenario $\omega_s$ occurs.
Pro: It’s a linear program.
Con: It’s a BIG linear program.
Imagine the following (real) problem. A Telecom company wants to expand its network so as to meet an unknown (random) demand.
There are 86 unknown demands. Each demand is independent and may take on one of five values.
\[ S = |\Omega| = \prod_{k=1}^{86} 5 = 5^{86} \approx 1.29 \times 10^{60} \]
The number of subatomic particles in the universe.
How do we solve a problem that has more variables and moreconstraints than the number of subatomic particles in the universe?
But It’s Even Worse!
The answer is we can’t!
If Ω is not a countable set, say if it is made up of continuous-valued random variables, our “deterministic equivalent” would have ∞ variables and constraints. :-)
We solve an approximating problem obtained through sampling.
The Very Good News
Using Monte-Carlo methods (Sample Average Approximation), we can obtain high-quality solutions
Even Better: Can obtain (statistical) bounds on the quality of the solution
Sample Average Approximation (SAA)
The Story
Solving two-stage SP exactly is often impossible
Solving two-stage SP approximately is often easy: Sample Average Approximation (SAA)
I view SAA as the Jeff Linderoth of solution methods
It ain’t smart
It ain’t sexy
But it generally does work!
SAA for Dummies
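Let $v^*$ be the optimal value of the “true” problem:
\[ v^* \overset{\mathrm{def}}{=} \min_{x \in X} \left\{ f(x) \overset{\mathrm{def}}{=} E_\omega F(x,\omega) \right\} \]
Take a sample $(\omega^1, \dots, \omega^N)$ of N realizations of the vector ω, and form the sample average function
\[ f_N(x) \overset{\mathrm{def}}{=} N^{-1} \sum_{j=1}^{N} F(x, \omega^j) \]
For Stochastic LP w/recourse, evaluating $f_N(x)$ ⇒ solve one LP for each of N scenarios
Optimize sample average function:
\[ v_N \overset{\mathrm{def}}{=} \min_{x \in X} \left\{ f_N(x) \overset{\mathrm{def}}{=} N^{-1} \sum_{j=1}^{N} F(x, \omega^j) \right\} \]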
SAA for Dummies, Cont.
Note that $v_N$ is a random variable, as it depends on the (random) sample of size N
From this information, we can get bounds on the optimal solution value $v^*$
All “Good” Talks Contain...
Thm. $E(v_N) \le v^* \le f(x) \quad \forall x \in X$
Making SAA Work
Take a solution x from a SAA instance
We are mostly interested in estimating the quality of a given solution x. This is $f(x) - v^*$.
1 Get an upper bound on $v^*$ from f(x). Estimate f(x) by solving N′ (completely independent) linear programs: recourse LPs with x fixed.
\[ f_{N'}(x) \overset{\mathrm{def}}{=} (N')^{-1} \sum_{j=1}^{N'} F(x, \omega^j) \]
2 Get a lower bound on $v^*$ from $E(v_N)$. Estimate $E(v_N)$ by solving M independent stochastic LPs, giving optimal values $v_N^1, v_N^2, \dots, v_N^M$
\[ E(v_N) \approx M^{-1} \sum_{j=1}^{M} v_N^j \]
Independent ⇒ no synchronization ⇒ good for the Grid
Independent ⇒ can construct confidence intervals around the estimates
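The two estimates can be sketched on the newsvendor problem, written as a cost minimization so that it matches the min-form used here. This is a toy under stated assumptions: demand is Normal(100, 50) clipped at zero, each SAA instance is solved by trying every sampled demand as the order quantity (valid because the sample average is piecewise linear and convex with kinks at the samples), and the sizes N, M, N′ are arbitrary.

```python
import random
from statistics import mean, stdev

c, q, r = 50.0, 70.0, 5.0

def cost(x, d):
    """F(x, d): purchase cost minus sales and return revenue (to be minimized)."""
    sold = min(x, d)
    return c * x - q * sold - r * (x - sold)

def draw(rng, n):
    """n demand samples; Normal(100, 50) clipped at zero is an assumption."""
    return [max(0.0, rng.gauss(100.0, 50.0)) for _ in range(n)]

def solve_saa(sample):
    """Minimize the sample average; some sampled demand value is optimal."""
    def f_N(x):
        return mean(cost(x, d) for d in sample)
    x_best = min(sample, key=f_N)
    return x_best, f_N(x_best)

rng = random.Random(1)
N, M, N_prime = 50, 10, 10_000

# Lower bound: average the optimal values of M independent SAA instances.
solutions = [solve_saa(draw(rng, N)) for _ in range(M)]
v_values = [v for _, v in solutions]
lower_estimate = mean(v_values)
lower_stderr = stdev(v_values) / M ** 0.5

# Upper bound: evaluate one candidate x on a fresh, larger sample.
x_hat = solutions[0][0]
evaluations = [cost(x_hat, d) for d in draw(rng, N_prime)]
upper_estimate = mean(evaluations)
upper_stderr = stdev(evaluations) / N_prime ** 0.5

gap_estimate = upper_estimate - lower_estimate
```

Because both quantities are sample estimates, the gap estimate can come out slightly negative on an unlucky draw; the standard errors are what turn the two numbers into confidence intervals.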
More Theory
A very interesting result of Shapiro and Homem-de-Mello says the following:
Suppose that $x^*$ is the unique optimal solution to the “true” problem
Let $x_N$ be the solution to the sampled approximating problem
Under certain conditions, the event ($x_N = x^*$) happens with probability 1 for N large enough.
The probability of this event approaches 1 exponentially fast as N → ∞!!
There exists a constant β > 0 such that
\[ \limsup_{N \to \infty} N^{-1} \log[1 - P(x_N = x^*)] \le -\beta. \]
This is a qualitative result indicating that it might not be necessary to have a large sample size in order to solve the true problem exactly.
For a problem with $5^{1000}$ scenarios a sample of size N ≈ 400 is required in order to find the true optimal solution with probability 95%!!!
Does SAA Work on “Real” Problems?
M = 10 times: solve a stochastic sampled approximation of size N.
Compute confidence interval on lower bound estimate $E(v_N)$
Choose one x from the solutions to the M SAA instances and compute confidence interval on upper bound estimate $f_{N'}(x)$, with N′ = 10000
Test Instances
Name    Application               |Ω|
LandS   HydroPower Planning       10^6
gbd     Aircraft Allocation       6.46 × 10^5
storm   Cargo Flight Scheduling   6 × 10^81
20term  Vehicle Assignment        1.1 × 10^12
ssn     Telecom. Network Design   10^70
20term Convergence
[Figure: 20term lower bound and upper bound estimates versus sample size N (10 to 10000, log scale); value axis from 251500 to 255500]
ssn Convergence
[Figure: ssn lower bound and upper bound estimates versus sample size N (10 to 10000, log scale); value axis from 2 to 18]
storm Convergence
[Figure: storm lower bound and upper bound estimates versus sample size N (10 to 10000, log scale); value axis from 1.544e+06 to 1.555e+06]
gbd Convergence
[Figure: gbd lower bound and upper bound estimates versus sample size N (10 to 10000, log scale); value axis from 1500 to 1800]
Multistage SPs
Multistage Stochastic LP
[Figure: timeline alternating random events and decisions: ω1, x1, ω2, x2, ω3, ..., x_{T−1}, ω_T, x_T]
Random vectors $\omega_1 \in R^{n_1}, \omega_2 \in R^{n_2}, \dots, \omega_T \in R^{n_T}$
Make sequence of decisions $x_1 \in X_1, x_2 \in X_2, \dots, x_T \in X_T$.
Risk Neutral: We always aim to optimize the expected value of our current decision $x_t$
Linear: Assume $X_t$ are polyhedra
Discrete: Assume $\omega_t$ are drawn from a discrete distribution.
The Hard Part
Decisions made at period t ($x_t$) must only depend on events and decisions up to period t
The Stickler. My Favorite Eight Syllable Word.
We need to enforce nonanticipativity.
Other eight-syllable words...
autosuggestibility, incommensurability, electroencephalogram, unidirectionality
At any point in time, different scenarios “look the same”
We can’t allow different decisions for these scenarios.
We are not allowed to anticipate the outcome of future random events when making our decision now.
How to do it?
1 Use Tree Structure (Nested Decomposition)
2 Create (extra) variables for all possible scenarios, and enforce equality between decisions that should be nonanticipative (Progressive Hedging)
Scenario Tree
xnxρ(n)
x0ξ1
ξ2
N: Set of nodes in the tree
ρ(n): Unique predecessor of noden in the tree
S(n): Set of successor nodes of n
qn: Probability that the sequenceof events leading to node n occurs
xn: Decision taken at node n
Warning!
Scenario Trees can get big
There are some tools that try to “prune” the tree while keeping similar statistical properties in the stochastic process
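As a rough illustration of the notation, a scenario tree can be held in a small linked structure. This is only a sketch: the class name, field names, and the numbers are assumptions made for the example, with each node storing its conditional probability given its predecessor.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    name: str
    cond_prob: float = 1.0                                   # probability given the predecessor
    parent: Optional["Node"] = field(default=None, repr=False)       # rho(n)
    children: List["Node"] = field(default_factory=list, repr=False)  # S(n)

    def add_child(self, name, cond_prob):
        child = Node(name, cond_prob, parent=self)
        self.children.append(child)
        return child

    def prob(self):
        """q_n: product of conditional probabilities on the path from the root."""
        p, node = 1.0, self
        while node is not None:
            p *= node.cond_prob
            node = node.parent
        return p

def leaves(node):
    """Leaf nodes below (or equal to) node; each leaf is one scenario."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.children)

# Three-stage tree with two branches per node (illustrative numbers).
root = Node("0")
for name, p in [("a", 0.4), ("b", 0.6)]:
    child = root.add_child(name, p)
    child.add_child(name + "1", 0.5)
    child.add_child(name + "2", 0.5)

print(count_nodes(root), [round(leaf.prob(), 3) for leaf in leaves(root)])
```

With K branches per node and T stages the leaf count grows like K^(T-1), which is why the warning about tree size matters.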
Multistage SPs
Multistage Stochastic Programming
Extensive Form
$$z_{SP} = \min \Big\{ \sum_{n \in N} q_n c_n^T x_n \;\Big|\; T_n x_{\rho(n)} + W_n x_n = h_n \;\; \forall n \in N \Big\}$$
Value Function of node n
$$Q_n(x_{\rho(n)}) \overset{\text{def}}{=} \min_{x_n} \Big\{ c_n^T x_n + \sum_{m \in S(n)} q_{mn} Q_m(x_n) \;\Big|\; W_n x_n = h_n - T_n x_{\rho(n)} \Big\}$$
$q_{mn}$: conditional probability of successor node $m$ given node $n$
Tree structure encodes nonanticipativity
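The link between the value functions and the extensive form can be made explicit. The following is a short sketch, assuming that $q_{mn}$ is the conditional probability of reaching successor $m$ from node $n$ (so that path probabilities multiply) and that $q_n > 0$:

```latex
\begin{align*}
q_m &= q_{mn}\, q_n \qquad \forall m \in S(n), \\
q_n\, Q_n(x_{\rho(n)}) &= \min_{x_n} \Big\{ q_n c_n^T x_n
   + \sum_{m \in S(n)} q_m\, Q_m(x_n) \;\Big|\;
   W_n x_n = h_n - T_n x_{\rho(n)} \Big\}.
\end{align*}
```

Applying the second identity repeatedly from the root down to the leaves replaces every $Q_m$ by its own minimization, which collects the terms $q_n c_n^T x_n$ and the constraints $T_n x_{\rho(n)} + W_n x_n = h_n$ over all nodes, i.e. the extensive form.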
Multistage SPs Algorithm
Nested Decomposition
0: Root node of the scenario tree
x0: Initial state of the system
Recursive Formulation
zSP = Q0(x0)
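Cost to go: $G_n(x) \overset{\text{def}}{=} \sum_{m \in S(n)} q_{mn} Q_m(x)$
$M_n^k(x)$: Lower bound on $G_n(x)$ in iteration $k$
$$Q_n(x_{\rho(n)}) \ge \min_{x_n} \Big\{ c_n^T x_n + M_n^k(x_n) \;\Big|\; W_n x_n = h_n - T_n x_{\rho(n)} \Big\} \qquad (\mathrm{MLP}_n)$$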
Multistage SPs Algorithm
Building $M_n^k(x)$
Create a partition (or clustering $C_n$) of $S(n)$
A lower bound $m_{n[j]}^k$ for each element of the partition (each cluster) is created independently
$$M_n^k(x) \overset{\text{def}}{=} \sum_{j \in C_n} m_{n[j]}^k(x)$$
$$m_{n[j]}^k(x) \overset{\text{def}}{=} \inf \Big\{ \theta_j \;\Big|\; \theta_j e \ge F_{n[j]}^k x + f_{n[j]}^k \Big\}$$
$F_{n[j]}^k, f_{n[j]}^k$ obtained from dual solutions (to form subgradients) of linear programs of nodes within cluster $[j]$
$M_n^k(x^*) \to G_n(x^*)$
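A one-dimensional sketch may help show why a collection of cuts is a lower bound that becomes tight where cuts were generated. It assumes a single cluster, a scalar decision x, and a made-up cost-to-go (an expected shortage penalty); the subgradients are computed in closed form rather than from LP duals, so this illustrates the cut model only, not the talk's implementation.

```python
DEMANDS = [10.0, 20.0, 30.0]   # illustrative child-node data
PROBS = [0.3, 0.4, 0.3]
PENALTY = 3.0

def cost_to_go(x):
    """G(x): expected penalty for demand not covered by x (convex in x)."""
    return sum(p * PENALTY * max(d - x, 0.0) for d, p in zip(DEMANDS, PROBS))

def subgradient(x):
    """A subgradient of G at x (the right derivative at kinks)."""
    return sum(-p * PENALTY for d, p in zip(DEMANDS, PROBS) if d > x)

def make_cut(x_hat):
    """Return (F, f) such that G(x) >= F*x + f for all x, with equality at x_hat."""
    g = subgradient(x_hat)
    return g, cost_to_go(x_hat) - g * x_hat

def cut_model(x, cuts):
    """m(x) = inf{theta : theta >= F*x + f for every cut} = max over cuts."""
    return max(F * x + f for F, f in cuts)

cuts = [make_cut(x_hat) for x_hat in (0.0, 15.0, 25.0)]
for x in (0.0, 12.0, 15.0, 25.0, 35.0):
    print(x, round(cut_model(x, cuts), 3), round(cost_to_go(x), 3))
```

Each added cut can only raise the model, so the bound tightens as iterations accumulate cuts near the points the algorithm actually visits.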
Multistage SPs Algorithm
Action Pictures
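[Diagrams: successive snapshots of the scenario tree. Decisions starting from $x_0$ are passed forward through realizations $\xi_1, \xi_2, \xi_3$; cuts $(F_{n[j]}^k, f_{n[j]}^k)$ are passed back toward $x_0$.]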
1 Solve $\mathrm{MLP}_0$ to get $x_0$. Send policy forward
2 Solve each $\mathrm{MLP}_n$, $n \in S(0)$, using $x_0$ and realizations $\xi_1$
3 Continue forward to end
4 Go backwards. Send cuts from children back to parent. Update $\mathrm{MLP}_n$ and re-solve.
5 Lather, Rinse, Repeat.
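The loop above can be sketched on a toy instance. Everything here is an assumption for illustration: the tree has only a root and three leaves, the decision is a single number, the leaf problems are solved in closed form, and the root problem is solved by enumerating breakpoints instead of calling an LP solver. It shows the forward pass, the cut sent backward, and the bounds closing, not the parallel implementation discussed in the talk.

```python
import itertools

DEMANDS = [10.0, 20.0, 30.0]   # realizations at the child nodes (made-up data)
PROBS = [0.3, 0.4, 0.3]
COST, PENALTY, X_MAX = 1.0, 3.0, 40.0

def solve_master(cuts):
    """Minimize COST*x + max_j(F_j*x + f_j) over [0, X_MAX].

    In one dimension the minimizer of a piecewise-linear convex function is an
    interval endpoint or a crossing point of two cuts, so enumerate those.
    """
    candidates = [0.0, X_MAX]
    for (F1, f1), (F2, f2) in itertools.combinations(cuts, 2):
        if abs(F1 - F2) > 1e-12:
            x = (f2 - f1) / (F1 - F2)
            if 0.0 <= x <= X_MAX:
                candidates.append(x)

    def objective(x):
        return COST * x + max(F * x + f for F, f in cuts)

    x_best = min(candidates, key=objective)
    return x_best, objective(x_best)

def solve_child(x, demand):
    """Return the child's cost and a subgradient with respect to x."""
    shortfall = demand - x
    if shortfall > 0.0:
        return PENALTY * shortfall, -PENALTY
    return 0.0, 0.0

def nested_decomposition(tol=1e-6, max_iter=20):
    cuts = [(0.0, 0.0)]          # theta >= 0 is a valid lower bound here
    best_upper = float("inf")
    for k in range(1, max_iter + 1):
        x0, lower = solve_master(cuts)                    # step 1
        results = [solve_child(x0, d) for d in DEMANDS]   # steps 2-3
        expected = sum(p * q for p, (q, _) in zip(PROBS, results))
        best_upper = min(best_upper, COST * x0 + expected)
        if best_upper - lower <= tol:
            return x0, lower, best_upper, k
        slope = sum(p * g for p, (_, g) in zip(PROBS, results))  # step 4
        cuts.append((slope, expected - slope * x0))       # send cut to parent
    return x0, lower, best_upper, max_iter

x0, lower, upper, iterations = nested_decomposition()
print(x0, lower, upper, iterations)
```

In a genuine multistage tree the same forward and backward passes are applied at every interior node, with each node acting as a master for its successors.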
Multistage SPs Algorithm
A small Multistage Telecom Problem
[Diagram: network with nodes A, B, C, D, E, F]
Set of stages $T$
Set $J$ of links
Sets $I_t$ of demands
Random demand $d_t(\xi) \in \mathbb{R}^{|I_t|}$
Budget each period
Install capacity on links each period to minimize the total expected unserved demand
Multistage SPs Algorithm
Some (Limited) Computational Results
T = 5
K: Realizations/Period
N: Number of scenarios
DE: Size of deterministic equivalent

K    N       DE Size
30   0.81M   18M * 31M
50   6.25M   140M * 236M
60   12.9M   290M * 488M
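The scenario counts in the table are consistent with a tree whose first stage is a single root node, so that N = K^(T-1). That relationship is an inference from the numbers, not something stated on the slide; the helper names below are hypothetical.

```python
def num_scenarios(K, T):
    """Leaves of a tree with K realizations per period and a single root node."""
    return K ** (T - 1)

def num_tree_nodes(K, T):
    """Total nodes over all T stages of the same tree."""
    return sum(K ** t for t in range(T))

for K in (30, 50, 60):
    print(K, num_scenarios(K, 5), num_tree_nodes(K, 5))
```

For T = 5 this gives 810,000, 6,250,000 and 12,960,000 scenarios, matching the 0.81M, 6.25M and (approximately) 12.9M entries.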
Multistage SPs Algorithm
Computational Results
It: Number of iterations (Times MLP0 was solved)
E: Parallel efficiency = (Time machines solving MLPn) / (Time machines available)

K    It   Avg Workers   Wall Time    CPU Time       E
30   9    62            2:34:21      6:15:15:10     67
50   7    75            1:12:49:27   85:20:24:15    77
60   11   162           3:16:51:00   431:12:15:37   73
Multistage SPs Modeling Tools
Existing Modeling Tools
Many stochastic programming implementations I’m aware of have been built from scratch
But there are some modeling tools on the way

Name               Author(s)                 Comment
AIMMS              AIMMS Team                Commercial
GAMS               GAMS Team                 Commercial
MPL                Kristjensen               Commercial
XPRESS-SP          Verma, Dash Opt.          Commercial, Beta
SPInE              Valente, CARISMA
STRUMS             Fourer and Lopes          Prototype(?)
SUTIL              Czyzyk and Linderoth      C++ classes
SLPLib             Felt, Sarich, Ariyawansa  Open Source C Routines
COIN-Smi, SP/OSL   COIN, IBM                 C++ methods
Multistage SPs Modeling Tools
Existing Solution Tools
Most stochastic programming implementations of which I’m aware merely form and solve the extensive form
Other software:

Name      Author(s)          Comment
AIMMS     AIMMS Team         Commercial, L-shaped method
SLP-IOR   Kall, Mayer        L-shaped, Stochastic Decomposition, others
MSLiP     Gassmann           Nested L-shaped
SPInE     Valente, CARISMA   Commercial, L-shaped method, may not exist anymore
BNBS      Altenstedt         Nested L-shaped method, Open source
ATR       Linderoth, Wright  Designed to run in parallel. Not simple to build and run
Multistage SPs Modeling Tools
Conclusions
Stochastic Programming
A tool for decision making under uncertainty
Considers the impact of recourse decisions
It may not be the answer, but it does help you hedge against upcoming uncertainty
More importantly, it gets people talking about the impact of uncertainty in the decision making process
Planning with “mean-value” estimates will not lead to an optimal policy
Used with some success in industry
Financial Services (Many successes)
Logistics and Supply Chain (Fewer successes, but coming!)
Tools and algorithms are “on the way”
Multistage SPs Modeling Tools
We Want YOU!
To consider using Stochastic Programming as a decision support tool to help manage in turbulent times!
Thanks!
I am happy to help. email: [email protected]
http://www.stoprog.org/
Multistage SPs Modeling Tools
Some Take Away Quotes
“If a man will begin with certainty, he shall end in doubts, but if he will be content to begin with doubts, he shall end in certainties”
— Francis Bacon
“It is a good thing for the uneducated person to read books of quotations”
—Winston Churchill