decision diagrams and dynamic programmingpublic.tepper.cmu.edu/jnh/ddanddp2slides.pdf · dynamic...

67
Decision Diagrams and Dynamic Programming J. N. Hooker Carnegie Mellon University CPAIOR 2013

Upload: others

Post on 12-Oct-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagramsand Dynamic Programming

J. N. HookerCarnegie Mellon University

CPAIOR 2013

Page 2: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams & Dynamic Programming

● Binary/multivalued decision diagrams are related to dynamic programming.

– But there are importantdifferences.

Page 3: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams & Dynamic Programming

● Binary/multivalued decision diagrams are related to dynamic programming.

– But there are importantdifferences.

– Dynamic programminghas state variablesand state-dependentcosts.

Page 4: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams & Dynamic Programming

● We extend the theory of decision diagrams to accommodatestate-dependent-costs.

– We prove uniquenesstheorem for weightedDDs using canonicalcosts.

Page 5: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams & Dynamic Programming

● We extend the theory of decision diagrams to accommodatestate-dependent-costs.

– We prove uniquenesstheorem for weightedDDs using canonicalcosts.

• We can now view DPstate transition graphas a decision diagram.

– And perhaps reduce the decision diagramto simplify the DP model.

Page 6: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Outline

● Dynamic programming example

● Weighted decision diagrams and canonical costs

● Application to the example

● Future work

Page 7: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Dynamic Programming

• Dynamic programming (including the name) was introduced by Richard Bellman in 1950s.

– Different concept than decision diagram, caching, etc.

– But DP state transition graph can be viewed as a weighted decision diagram.

• Illustration: a very basic inventory management problem.

– In the literature at least 50 years.

Page 8: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Inventory Management Example

• In each period i, we have:

– Demand di.

– Unit production cost ci.

– Warehouse space m.

– Unit holding cost hi.

• In each period, we decide:

– Production level xi.

– Stock level si.

• Objective:

– Meet demand each period while minimizing production and holding costs.

Page 9: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

0 21

0 21

0 21

0

0

Period i = 1

i = 2

i = 3

i = 4

State si = stock level

Demand di = 2 in each period i

State transition graph

Page 10: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

0 21

0 21

0 21

0

0

Period i = 1

i = 2

i = 3

i = 4

State si = stock level

Demand di = 2 in each period i

Transition xi = 4 units manufactured

State transition graph

Page 11: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

0 21

0 21

0 21

0

0

Period i = 1

i = 2

i = 3

i = 4

State si = stock level

Demand di = 2 in each period i

Transition xi = 4 units manufactured

Each path represents a solution

State transition graph

Page 12: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

Transition cost(immediate cost)

h2 = 2

h3 = 1

h4 = 2

c1 = 2

c2 = 3

c3 = 5

c4 = 6

Unit holding cost = hiUnit manufacturing cost = ci

i i i ih s c x+

Transition costs

Page 13: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

h2 = 2

h3 = 1

h4 = 2

c1 = 2

c2 = 3

c3 = 5

c4 = 6

Cost to go gi (si)

Backward recursion

Page 14: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

1( ) ( )i i i i i i i i i ig s h s c x g s x d+= + + + −

Backward recursion

Page 15: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

22 18 14

{ }1( ) min ( )i

i i i i i i i i i ixg s h s c x g s x d+= + + + −

Backward recursion

Page 16: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

22 18 14

26 25

24

{ }1( ) min ( )i

i i i i i i i i i ixg s h s c x g s x d+= + + + −

Backward recursion

Page 17: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

22 18 14

26 25

24

30

{ }1( ) min ( )i

i i i i i i i i i ixg s h s c x g s x d+= + + + −

Backward recursion

Page 18: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

22 18 14

26

30

Optimal solution

Trace forward to find optimal path

Page 19: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12 8 4

14

26

30

Optimal solution

Trace forward to find optimal path

Page 20: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

0

12

14

26

30

Optimal solution

Trace forward to find optimal path

Page 21: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Dynamic Programming Recursion

• In general, the state transition is

• Cost is a function of state and control pairs

• The recursion is

– with boundary condition

– and optimal value for starting state s1

1 ( , ), 1, ,i i i is s x i nϕ+ = = K

1

( ) ( , )n

i i ii

f x c s x=

=∑

( ){ }1( ) min ( , ) ( , )i

i i i i i i i i ixg s c s x g s xϕ+= +

1 1 1( ) 0, all n n ng s s+ + +=

1 1( )g s

Page 22: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Dynamic Programming Characteristics

• There are state variables in addition to decision variables.

• Costs are function of state variables as well as decision variables.

• State transitions are Markovian.

– Current state determines possible transitions and costs.

• Problem is solved recursively.

– Often by moving backward through stages.

• The art of dynamic programming:

– Find a small state description that is Markovian.

Page 23: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

DP vs Caching

• Dynamic programming ≠≠≠≠ caching

– Yes, DP identifies equivalent subproblems.

Page 24: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

DP vs Caching

• Dynamic programming ≠≠≠≠ caching

– Yes, DP identifies equivalent subproblems.

– But not by identifying equivalent states.

– All states are treated separately (except in approximate DP).

• The intelligence is in the state description.

Page 25: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

DP vs Caching

• However, caching can be applied on top of DP.

– We will use the concept of reduced decision diagram (reduced MDD) to identify equivalent states.

• Problem: how to deal with state-dependent costs.

Page 26: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

Arcs leaving each node are very similar.• Transition to the same

states.• Have the same costs,

up to an offset.

Reducing the Transition Graph

Page 27: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

2+5

1+15

4+3

2+9

0+6

0 21

0 21

0 21

0

0

0+8

0+60+4

0+12

0+9

0+10

0+20

0+15

4+6

4+0

2+10

2+0

4+0

0+12

2+3

2+6

1+5

1+10

2+6

Arcs leaving each node are very similar.• Transition to the same

states.• Have the same costs,

up to an offset.

Incidentally, there is also a bang-bang solution.

Reducing the Transition Graph

Page 28: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

12

0

Reducing the Transition Graph

13 14

10 9 8

6 7 8

4

By rearranging the costs, we can collapse the states in each period.

x1 = 2x1 = 3

x1 = 4

Page 29: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

12

0

Reducing the Transition Graph

13 14

10 9 8

6 7 8

4

By rearranging the costs, we can collapse the states in each period.

Now it is easier to compute the optimal solution

0

0

12

20

26

30

x1 = 2x1 = 3

x1 = 4

Page 30: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

12

0

Reducing the Transition Graph

13 14

10 9 8

6 7 8

4

By rearranging the costs, we can collapse the states in each period.

Now it is easier to compute the optimal solution

This looks like reduction of a decision diagram (MDD).

We will develop this idea in general.

0

0

12

20

26

30

x1 = 2x1 = 3

x1 = 4

Page 31: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams

Set covering example

Select a minimum-weight family of sets that contain all 4 elements A, B, C, D

xi = 1 when we select set i

Weight 3 5 4 6

Page 32: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Decision Diagrams

Decision diagram

Each path corresponds to a feasible solution.

xi = 1 when we select set i

Weight 3 5 4 6

x1 = 0 x1 = 1

Page 33: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Separable cost function

Just label arcs with weights.

Shortest path correspondsto an optimal solution.

xi = 1 when we select set i

Weight 3 5 4 6

Page 34: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

• State-dependent costs in dynamic programming imply a nonseparable cost function:

where

– We need a theory of decision diagrams that deals with nonseparable costs.

1

( ) ( , )n

i i ii

f x c s x=

=∑

1 ( , ), 1, ,i i i is s x i nϕ+ = = K

Page 35: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Now what?

Page 36: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

Page 37: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

But now we can’treduce the treeto an efficientdecision diagram.

Page 38: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

But now we can’treduce the treeto an efficientdecision diagram.

We will rearrangecosts to obtaincanonical costs.

Page 39: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

But now we can’treduce the treeto an efficientdecision diagram.

We will rearrangecosts to obtaincanonical costs.

0

6

0 1

7

0

5

0 2 0 2

6 7

Page 40: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

But now we can’treduce the treeto an efficientdecision diagram.

We will rearrangecosts to obtaincanonical costs.

0

6

0 1

7

0

5

0 2 0 2

6 70 1

6

0 10

5 6

Page 41: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Put costs on leavesof branching tree.

But now we can’treduce the treeto an efficientdecision diagram.

We will rearrangecosts to obtaincanonical costs.

0

6

0 1

7

0

5

0 2 0 2

6 70 1

6

0 10

5 60 0 1

6 5

Page 42: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Now the tree canbe reduced.

0

6

0 1

7

0

5

0 2 0 2

6 70 1

6

0 10

5 60 0 1

6 5

Page 43: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Now the tree canbe reduced.

Page 44: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

Note that DD is larger than reduced unweighted DD,but still compact.

Page 45: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

We can represent anydiscrete optimizationproblem with such adecision diagram�

even if the costs arenonseparable.

Page 46: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Nonseparable cost function

We know that without weights,there is a unique reduceddecision diagram for a givenvariable ordering.

Is this true for decision diagrams with canonicalweights?

Yes.

Page 47: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Definition. Costs on a decision diagram are canonical if for every node in layer i, the costs cij leaving that node satisfy

for fixed αi (e.g., 0).{ }min ij ijc α=

Page 48: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Definition. Costs on a decision diagram are canonical if for every node in layer i, the costs cij leaving that node satisfy

for fixed αi (e.g., 0).

Theorem. Any given discrete optimization problem is uniquelyrepresented by a weighted decision diagram with canonical costs,for a given variable ordering.

{ }min ij ijc α=

Page 49: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Definition. Costs on a decision diagram are canonical if for every node in layer i, the costs cij leaving that node satisfy

for fixed αi (e.g., 0).

Theorem. Any given discrete optimization problem is uniquelyrepresented by a weighted decision diagram with canonical costs,for a given variable ordering.

• Similar result proved for Affine Algebraic Decision Diagrams (AADDs) by Sanner and McAllester (IJCAI 2005).

– Definition of canonical is somewhat different.

{ }min ij ijc α=

Page 50: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

• Converting to canonical costs does not destroy the benefits of separability.

Definition. A decision diagram is separable when arc costs represent terms of a separable cost function.

Theorem. A separable decision diagram that is reduced when costs are ignored is also reduced when costs are converted to canonical costs.

Page 51: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Weighted Decision Diagrams

Example

Reduced unweighted DD

Add separable costs

Reduced weighted DD with canonical costs

has same shape

Page 52: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

0 21

0 21

0 21

0

0

{ }1( ) min ( )i

i i i i i i i i i ixg s h s c x g s x d+= + + + −

Application to Inventory Problem

1 2x = 1 3x = 1 4x =

To equalize controls, let

Be the stock level in next period.i i i ix s x d′ = + −

7

16

7

11

6

8

64

12

9

10

20

15

10

4

12

2

4

12

5

8

611

8

23

4 1 2 3 0

12

Page 53: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

0 21

0 21

0 21

0

0

{ }1( ) min ( )i

i i i i i i i i i ixg s h s c x g s x d+= + + + −

Application to Inventory Problem

1 0x ′ = 1 1x ′ = 1 2x ′ =

To equalize controls, let

Be the stock level in next period.i i i ix s x d′ = + −

7

16

7

11

6

8

64

12

9

10

20

15

10

4

12

2

4

12

5

8

611

8

01

2 0 1 2 0

12

Page 54: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

7

16

7

11

6

0 21

0 21

0 21

0

0

8

64

12

9

10

20

15

10

4

12

2

4

12

5

8

611

8

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

1 0x ′ = 1 1x ′ = 1 2x ′ =

To equalize controls, let

Be the stock level in next period.i i i ix s x d′ = + −

01

2 0 1 2 0

12

New recursion:

Page 55: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

To obtain canonical costs,subtractfrom cost on each arc (si,si+1).

( )i i i ic m s h s− +

7

16

7

11

6

0 21

0 21

0 21

0

0

8

64

12

9

10

20

15

10

4

12

2

4

12

5

8

611

8

Page 56: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

To obtain canonical costs,subtractfrom cost on each arc (si,si+1).

Add these offsets to incoming arcs.

( )i i i ic m s h s− +

5

10

3

6

0

0 21

0 21

0 21

0

0

4

20

6

3

0

10

5

6

0

10

0

4

0

0

3

05

0

Page 57: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

To obtain canonical costs,subtractfrom cost on each arc (si,si+1).

Add these offsets to incoming arcs.

( )i i i ic m s h s− +

13

14

9

8

10

0 21

0 21

0 21

0

0

8

76

8

9

12

14

13

8

10

14

12

0

0

10

9

1213

0

4

Page 58: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

To obtain canonical costs,subtractfrom cost on each arc (si,si+1).

Add these offsets to incoming arcs.

Now outgoing arcs look alike.

And all arcs into state sihave the same cost

( )i i i ic m s h s− +

1 1 1 1 1 1( ) ( ) ( )i i i i i i i i ic s s h c d s m c m s+ + + + + += + − − + −

13

14

9

8

10

0 21

0 21

0 21

0

0

8

76

8

9

12

14

13

8

10

14

12

0

0

10

9

1213

0

4

Page 59: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1( ) min ( ) ( )i

i i i i i i i i i ix

g s h s c x s d g x+′′ ′= + − + +

Application to Inventory Problem

These are canonical costs with

13

14

9

8

10

0 21

0 21

0 21

0

0

8

76

8

9

12

14

13

8

10

14

12

0

0

10

9

1213

0

{ }1

1min ( )i

i i isc sα

++=

4

Page 60: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1 1 1min ( ) ( )i

i i i i i i i i ix

g h x c x m d c m x g+ + +′′ ′ ′= + − + + − +

Application to Inventory Problem

13

14

9

8

10

0 21

0 21

0 21

0

0

8

76

8

9

12

14

13

8

10

14

12

0

0

10

9

1213

0

New recursion:

These are canonical costs with

{ }1

1min ( )i

i i isc sα

++=

4

Page 61: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1 1 1min ( ) ( )i

i i i i i i i i ix

g h x c x m d c m x g+ + +′′ ′ ′= + − + + − +

Application to Inventory Problem

Now there is only one state per period.

New recursion:

12

0

13 14

10 9 8

6 7 8

4

0

0

12

20

26

30

Page 62: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

{ }1 1 1min ( ) ( )i

i i i i i i i i ix

g h x c x m d c m x g+ + +′′ ′ ′= + − + + − +

Application to Inventory Problem

Now there is only one state per period.

Note that computational tests are not necessary.

We immediately see the speedup from the reduction in the state space.

New recursion:

12

0

13 14

10 9 8

6 7 8

4

0

0

12

20

26

30

Page 63: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Future Research

• DP model simplification

– Go through the classical DP models and see under what conditions they can be simplified.

Page 64: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Future Research

• DP model simplification

– Go through the classical DP models and see under what conditions they can be simplified.

• New approach to approximate DP

– Approximate DP merges similar states to reduce state space.

• Approximation scheme is normally static (fixed in advance).

Page 65: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Future Research

• DP model simplification

– Go through the classical DP models and see under what conditions they can be simplified.

• New approach to approximate DP

– Approximate DP merges similar states to reduce state space.

• Approximation scheme is normally static (fixed in advance).

– Relaxation techniques for decision diagrams do the same.

• Choose which nodes to merge as the diagram is built.

Page 66: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Future Research

• DP model simplification

– Go through the classical DP models and see under what conditions they can be simplified.

• New approach to approximate DP

– Approximate DP merges similar states to reduce state space.

• Approximation scheme is normally static (fixed in advance).

– Relaxation techniques for decision diagrams do the same.

• Choose which nodes to merge as the diagram is built.

– Apply these techniques to state transition graph.

Page 67: Decision Diagrams and Dynamic Programmingpublic.tepper.cmu.edu/jnh/DDandDP2slides.pdf · Dynamic Programming • Dynamic programming (including the name) was introduced by Richard

Future Research

• DP model simplification

– Go through the classical DP models and see under what conditions they can be simplified.

• New approach to approximate DP

– Approximate DP merges similar states to reduce state space.

• Approximation scheme is normally static (fixed in advance).

– Relaxation techniques for decision diagrams do the same.

• Choose which nodes to merge as the diagram is built.

– Apply these techniques to state transition graph.

– Result is a dynamic dynamic programming• DP with dynamic state approximation based on information

in the decision diagram.