Post on 23-Jan-2016
Chapter 13
Constraint Optimization, Counting, and Enumeration
275 class
Outline
Introduction
Optimization tasks for graphical models; solving optimization problems with inference and search
Inference: bucket elimination, dynamic programming; mini-bucket elimination
Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
Hybrids of search and inference: cutset decomposition; super-bucket scheme
Example: map coloring
Variables: countries (A, B, C, etc.)
Values: colors (e.g., red, green, yellow)
Constraints: A ≠ B, A ≠ D, D ≠ E, etc. For the pair (A,B), the allowed combinations are: (red, green), (red, yellow), (green, red), (green, yellow), (yellow, green), (yellow, red).
[Figure: constraint graph over countries A-G]
Task: consistency? Find a solution, all solutions, counting
Constraint Satisfaction
Propositional satisfiability: φ = {(¬C), (A ∨ B ∨ C), (¬A ∨ B ∨ E), (¬B ∨ C ∨ D)}
Constraint Optimization Problems for Graphical Models
A finite COP is a triple ⟨X, D, F⟩ where:
variables X = {X1, ..., Xn}
domains D = {D1, ..., Dn}
cost functions F = {f1, ..., fm}
Global cost function: F(X) = Σ_{i=1..m} fi(Xi)
Example cost function f(A,B,D), with scope {A,B,D}:
A B D | Cost
1 2 3 | 3
1 3 2 | 2
2 1 3 | 0
2 3 1 | 0
3 1 2 | 5
3 2 1 | 0
Constraint Optimization Problems for Graphical Models (cont.)
Primal graph: variables become nodes; each function or constraint connects the nodes in its scope.
For f1(A,B,D), f2(D,F,G), f3(B,C,F), the primal graph is over nodes A, B, C, D, F, G, with each scope forming a clique.
f(A,B,D) has scope {A,B,D}
F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
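As a concrete sketch, the global cost function can be evaluated by summing the local cost tables. The tables below are hypothetical stand-ins, since the slide gives only the scopes of f1, f2, f3:

```python
import itertools

# Hypothetical ternary cost tables over binary variables, standing in for
# f1(A,B,D), f2(D,F,G), f3(B,C,F); the slide gives only the scopes.
f1 = {v: v[0] + 2 * v[1] + v[2] for v in itertools.product((0, 1), repeat=3)}
f2 = {v: 2 * v[0] + v[2] for v in itertools.product((0, 1), repeat=3)}
f3 = {v: 3 * v[1] for v in itertools.product((0, 1), repeat=3)}

def global_cost(a, b, c, d, f, g):
    # F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
    return f1[(a, b, d)] + f2[(d, f, g)] + f3[(b, c, f)]

print(global_cost(1, 0, 1, 1, 0, 0))  # → 7  (2 + 2 + 3)
```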
Constrained Optimization
Example: power plant scheduling
Variables: {X1, ..., Xn}, each with domain {ON, OFF}
Constraints: min-up and min-down times; power demand: Σ_i Power(Xi) ≥ Demand
Objective: minimize TotalFuelCost(X1, ..., XN)
Probabilistic Networks
Smoking
Bronchitis
Cancer
X-Ray
Dyspnoea
P(S)
P(B|S)
P(D|C,B)
P(C|S)
P(X|C,S)
P(S,C,B,X,D) = P(S)· P(C|S)· P(B|S)· P(X|C,S)· P(D|C,B)
P(D|C,B):
C B | D=0 D=1
0 0 | 0.1 0.9
0 1 | 0.7 0.3
1 0 | 0.8 0.2
1 1 | 0.9 0.1
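To make the factorization concrete, here is a minimal sketch combining the slide's P(D|C,B) table with hypothetical values for the remaining CPTs (which the slide does not give), checking that the factorized joint sums to 1:

```python
import itertools

# P(D|C,B) from the slide; rows indexed by (C, B), entries are (P(D=0), P(D=1)).
p_d = {(0, 0): (0.1, 0.9), (0, 1): (0.7, 0.3),
       (1, 0): (0.8, 0.2), (1, 1): (0.9, 0.1)}
# Hypothetical values for the remaining CPTs (not given on the slide).
p_s = (0.7, 0.3)                                  # P(S)
p_c = {0: (0.9, 0.1), 1: (0.6, 0.4)}              # P(C|S), indexed by S
p_b = {0: (0.8, 0.2), 1: (0.4, 0.6)}              # P(B|S), indexed by S
p_x = {(0, 0): (0.95, 0.05), (0, 1): (0.2, 0.8),  # P(X|C,S), indexed (C, S)
       (1, 0): (0.9, 0.1), (1, 1): (0.1, 0.9)}

def joint(s, c, b, x, d):
    # P(S,C,B,X,D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
    return p_s[s] * p_c[s][c] * p_b[s][b] * p_x[(c, s)][x] * p_d[(c, b)][d]

total = sum(joint(*v) for v in itertools.product((0, 1), repeat=5))
assert abs(total - 1.0) < 1e-9  # the product of CPTs is a proper distribution
```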
Outline
Introduction
Optimization tasks for graphical models; solving by inference and search
Inference: bucket elimination, dynamic programming, tree-clustering; mini-bucket elimination, belief propagation
Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
Hybrids of search and inference: cutset decomposition; super-bucket scheme
Computing MPE
"Moral" graph over A, B, C, D, E.
MPE = max_{a,e=0,d,c,b} P(a) P(c|a) P(b|a) P(d|b,a) P(e|b,c)
    = max_a P(a) max_{e=0} max_d max_c P(c|a) max_b P(b|a) P(d|b,a) P(e|b,c)
    = max_a P(a) max_{e=0} max_d max_c P(c|a) h^B(a,d,c,e)
Variable Elimination
Finding MPE = max_x P(x):
MPE = max_{a,e,d,c,b} P(a) P(c|a) P(b|a) P(d|b,a) P(e|b,c)
Buckets are processed in the order B, C, D, E, A; max is the elimination operator:
bucket B: P(b|a), P(d|b,a), P(e|b,c) → h^B(a,d,c,e)
bucket C: P(c|a), h^B(a,d,c,e) → h^C(a,d,e)
bucket D: h^C(a,d,e) → h^D(a,e)
bucket E: e = 0, h^D(a,e) → h^E(a)
bucket A: P(a), h^E(a) → MPE
Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)
Generating the MPE-tuple
After the buckets are processed (B: P(b|a), P(d|b,a), P(e|b,c); C: P(c|a), h^B; D: h^C; E: e = 0, h^D; A: P(a), h^E), assign the variables in reverse order:
1. a' = argmax_a P(a) · h^E(a)
2. e' = 0
3. d' = argmax_d h^C(a',d,e')
4. c' = argmax_c P(c|a') · h^B(a',d',c,e')
5. b' = argmax_b P(b|a') · P(d'|b,a') · P(e'|b,c')
Return (a', b', c', d', e')
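A minimal Python sketch of elim-mpe on this network, with hypothetical random CPTs (and, for simplicity, no evidence, so bucket E maximizes over e instead of fixing e = 0); the result is checked against brute-force enumeration:

```python
import itertools
import random

random.seed(0)

# Hypothetical random CPTs for P(A), P(B|A), P(C|A), P(D|B,A), P(E|B,C);
# all variables are binary.
def cpt(n_parents):
    table = {}
    for parents in itertools.product((0, 1), repeat=n_parents):
        p = random.random()
        table[parents] = (p, 1.0 - p)  # distribution over the child
    return table

pA, pB, pC, pD, pE = cpt(0), cpt(1), cpt(1), cpt(2), cpt(2)

def joint(a, b, c, d, e):
    return (pA[()][a] * pB[(a,)][b] * pC[(a,)][c]
            * pD[(b, a)][d] * pE[(b, c)][e])

# Process buckets B, C, D, E, A, recording a new function each time.
def hB(a, c, d, e):  # bucket B: P(b|a) P(d|b,a) P(e|b,c)
    return max(pB[(a,)][b] * pD[(b, a)][d] * pE[(b, c)][e] for b in (0, 1))

def hC(a, d, e):     # bucket C: P(c|a) h^B
    return max(pC[(a,)][c] * hB(a, c, d, e) for c in (0, 1))

def hD(a, e):        # bucket D: h^C
    return max(hC(a, d, e) for d in (0, 1))

def hE(a):           # bucket E: h^D (no evidence here, so max over e)
    return max(hD(a, e) for e in (0, 1))

mpe = max(pA[()][a] * hE(a) for a in (0, 1))

# Sanity check against brute-force enumeration of all 32 assignments.
brute = max(joint(*x) for x in itertools.product((0, 1), repeat=5))
assert abs(mpe - brute) < 1e-12
```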
Complexity
The largest function recorded, h^B(a,d,c,e), has four variables, so this run of elim-mpe is exp(w* = 4), where the "induced width" w* is the maximum clique size of the induced graph.
Complexity of bucket elimination
Bucket-elimination is time and space O(r exp(w*(d))), where r is the number of functions and w*(d) is the induced width of the primal graph along ordering d.
The effect of the ordering: for the example constraint graph over A, B, C, D, E, one ordering gives w*(d1) = 4 while another gives w*(d2) = 2.
Finding the smallest induced width is hard (NP-complete).
Directional i-consistency
Buckets along ordering A, B, C, D, E:
E: {B,E}, {C,E}, {D,E}
D: {A,D}, {C,D}
C: {B,C}
B: {A,B}
A: (empty)
Processing bucket E with different levels of directional consistency records constraints of different arity:
adaptive consistency records R_DCB
directional path-consistency (d-path) records R_DC, R_DB, R_CB
directional arc-consistency (d-arc) records R_D, R_C, R_B
Mini-bucket approximation: MPE task
Split a bucket into mini-buckets => bound complexity: maximizing over X separately in each mini-bucket yields a product g^X · h^X that upper-bounds the exact bucket function computed jointly.
Exponential complexity decrease: O(e^n) → O(e^r) + O(e^{n-r})
Mini-Bucket Elimination
Network over A, B, C, D, E with functions P(A), P(B|A), P(C|A), P(D|A,B), P(E|B,C). Bucket B is split into two mini-buckets, each maximized over B separately:
bucket B: {P(B|A), P(D|A,B)} and {P(E|B,C)} → h^B(A,D) and h^B(C,E)
bucket C: P(C|A), h^B(C,E) → h^C(A,E)
bucket D: h^B(A,D) → h^D(A)
bucket E: E = 0, h^C(A,E) → h^E(A)
bucket A: P(A), h^E(A), h^D(A)
MPE* is an upper bound on MPE (U); generating a solution from the bound yields a lower bound (L).
MBE-MPE(i): Algorithm approx-mpe (Dechter & Rish, 1997)
Input: i – the maximum number of variables allowed in a mini-bucket. Output: [lower bound (cost of a suboptimal solution), upper bound].
Example: approx-mpe(3) versus elim-mpe: the approximation works with induced width w* = 2 instead of w* = 4.
Properties of MBE(i)
Complexity: O(r exp(i)) time and O(exp(i)) space.
Yields an upper bound and a lower bound; accuracy is measured by the upper/lower (U/L) bound ratio.
As i increases, both accuracy and complexity increase.
Possible uses of mini-bucket approximations: as anytime algorithms, and as heuristics in search.
Other tasks: similar mini-bucket approximations exist for belief updating, MAP and MEU (Dechter and Rish, 1997).
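The core mini-bucket inequality (maximizing each mini-bucket separately can only increase the result) can be checked on a toy example with three random nonnegative functions sharing the bucket variable b; all tables here are hypothetical:

```python
import itertools
import random

random.seed(1)

# Three random nonnegative functions that all mention the bucket variable b.
f = {k: random.random() for k in itertools.product((0, 1), repeat=2)}  # f(a,b)
g = {k: random.random() for k in itertools.product((0, 1), repeat=2)}  # g(b,c)
h = {k: random.random() for k in itertools.product((0, 1), repeat=2)}  # h(b,d)

def exact(a, c, d):
    # full bucket: eliminate b from the product of all three functions
    return max(f[a, b] * g[b, c] * h[b, d] for b in (0, 1))

def mini(a, c, d):
    # mini-buckets {f, g} and {h}: eliminate b in each part separately
    return max(f[a, b] * g[b, c] for b in (0, 1)) * max(h[b, d] for b in (0, 1))

# The mini-bucket value upper-bounds the exact one for every assignment.
assert all(mini(a, c, d) >= exact(a, c, d)
           for a, c, d in itertools.product((0, 1), repeat=3))
```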
Outline
Introduction
Optimization tasks for graphical models; solving by inference and search
Inference: bucket elimination, dynamic programming; mini-bucket elimination
Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
Hybrids of search and inference: cutset decomposition; super-bucket scheme
The Search Space
Variables A, B, C, D, E, F, each with domain {0, 1}, connected by nine pairwise cost functions f1, ..., f9.
Objective function: min Σ_{i=1..9} fi
[Figure: the full binary OR search tree along ordering A, B, C, D, E, F]
Objective function components (costs for value pairs (0,0), (0,1), (1,0), (1,1)):
f1(A,B): 2, 0, 1, 4
f2(A,C): 3, 0, 0, 1
f3(A,E): 0, 3, 2, 0
f4(A,F): 2, 0, 0, 2
f5(B,C): 0, 1, 2, 4
f6(B,D): 4, 2, 1, 0
f7(B,E): 3, 2, 1, 0
f8(C,D): 1, 4, 0, 0
f9(E,F): 1, 0, 0, 2
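Since all nine cost tables are given, the optimal value of min Σ fi can be verified by brute-force enumeration of the 64 assignments:

```python
import itertools

# The nine pairwise cost functions from the slides; table[(u, v)] is the
# cost of assigning (u, v) to the scope variables.
F = {
    ("A", "B"): {(0, 0): 2, (0, 1): 0, (1, 0): 1, (1, 1): 4},
    ("A", "C"): {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    ("A", "E"): {(0, 0): 0, (0, 1): 3, (1, 0): 2, (1, 1): 0},
    ("A", "F"): {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 2},
    ("B", "C"): {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 4},
    ("B", "D"): {(0, 0): 4, (0, 1): 2, (1, 0): 1, (1, 1): 0},
    ("B", "E"): {(0, 0): 3, (0, 1): 2, (1, 0): 1, (1, 1): 0},
    ("C", "D"): {(0, 0): 1, (0, 1): 4, (1, 0): 0, (1, 1): 0},
    ("E", "F"): {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2},
}
VARS = "ABCDEF"

def total_cost(assignment):
    return sum(t[(assignment[u], assignment[v])] for (u, v), t in F.items())

best = min(
    (dict(zip(VARS, vals)) for vals in itertools.product((0, 1), repeat=6)),
    key=total_cost,
)
print(total_cost(best))  # → 5
```

The minimum total cost is 5.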
The Search Space
Arc-cost is calculated based on cost components.
[Figure: the same OR search tree with arc costs; each arc is labeled with the sum of all cost components fully determined by the assignments along it. Objective: min Σ_{i=1..9} fi]
The Value Function
Value of node = minimal cost solution below it
[Figure: the search tree with node values computed bottom-up (node value = min over children of arc cost + child value); the root value is 5]
An Optimal Solution
Value of node = minimal cost solution below it
[Figure: the same value-annotated tree with an optimal solution path highlighted; its cost equals the root value 5]
Basic Heuristic Search Schemes
A heuristic function f(x^p) computes a lower bound on the best extension of the partial assignment x^p, and can be used to guide a heuristic search algorithm. We focus on:
1. Branch and Bound: uses f(x^p) to prune the depth-first search tree; linear space.
2. Best-First Search: always expands the node with the best heuristic value f(x^p); needs lots of memory.
Classic Branch-and-Bound
At a node n of the OR search tree, g(n) is the cost of the path from the root to n and h(n) underestimates the best-cost extension below n, giving a lower bound LB(n) = g(n) + h(n). UB, the cost of the best solution found so far, is an upper bound on the optimum. Prune below n whenever LB(n) ≥ UB.
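A minimal depth-first branch and bound sketch of this scheme on a tiny hypothetical problem (three binary variables with pairwise costs, not the slides' example); h(n) is the trivial admissible heuristic 0, so LB(n) = g(n):

```python
# Hypothetical pairwise costs over binary variables X0, X1, X2.
costs = {
    (0, 1): {(0, 0): 2, (0, 1): 0, (1, 0): 1, (1, 1): 3},
    (1, 2): {(0, 0): 1, (0, 1): 4, (1, 0): 0, (1, 1): 2},
    (0, 2): {(0, 0): 0, (0, 1): 3, (1, 0): 2, (1, 1): 0},
}

def g(assignment):
    # cost of all functions whose scope is fully assigned so far
    n = len(assignment)
    return sum(t[(assignment[i], assignment[j])]
               for (i, j), t in costs.items() if i < n and j < n)

def branch_and_bound():
    ub = float("inf")          # upper bound: best full solution so far
    stack = [()]               # partial assignments, explored depth-first
    while stack:
        node = stack.pop()
        if g(node) >= ub:      # LB(n) = g(n) + 0 >= UB: prune this subtree
            continue
        if len(node) == 3:
            ub = g(node)       # a full assignment that improves UB
            continue
        stack.extend(node + (v,) for v in (0, 1))
    return ub

print(branch_and_bound())  # → 0 (achieved by X0=0, X1=1, X2=0)
```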
How to Generate Heuristics
The principle of relaxed models:
Linear optimization for integer programs
Mini-bucket elimination
Bounded directional consistency ideas
Generating Heuristics for graphical models (Kask and Dechter, 1999)
Given a cost function
C(a,b,c,d,e) = f(a) · f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)
define an evaluation function over a partial assignment as the cost of its best extension:
f*(a,e,d) = min_{b,c} C(a,b,c,d,e) = f(a) · min_{b,c} [ f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b) ]
          = g(a,e,d) · H*(a,e,d)
Generating Heuristics (cont.)
H*(a,e,d) = min_{b,c} f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)
          = min_c [ f(c,a) · min_b [ f(e,b,c) · f(b,a) · f(d,a,b) ] ]
          ≥ min_c [ f(c,a) · min_b f(e,b,c) · min_b [ f(b,a) · f(d,a,b) ] ]
          = min_b [ f(b,a) · f(d,a,b) ] · min_c [ f(c,a) · min_b f(e,b,c) ]
          = h^B(d,a) · h^C(e,a)
          = H(a,e,d)
so f(a,e,d) = g(a,e,d) · H(a,e,d) ≤ f*(a,e,d).
The heuristic function H is what is compiled during the preprocessing stage of the Mini-Bucket algorithm.
Static MBE Heuristics
Given a partial assignment x^p, estimate the cost of the best extension to a full solution. The evaluation function f(x^p) can be computed using the functions recorded by the Mini-Bucket scheme. For the belief network over A, B, C, D, E, the buckets are:
B: P(E|B,C), P(D|A,B), P(B|A)
C: P(C|A), h^B(E,C)
D: h^B(D,A)
E: h^C(E,A)
A: P(A), h^E(A), h^D(A)
f(a,e,D) = P(a) · h^B(D,a) · h^C(e,a); f = g · h is admissible.
f(a,e,D) = g(a,e) · H(a,e,D)
Heuristics Properties
The MB heuristic is monotone and admissible, and is retrieved in linear time.
IMPORTANT: heuristic strength can vary with MB(i). A higher i-bound means more pre-processing, a stronger heuristic, and less search.
This allows a controlled trade-off between preprocessing and search.
Experimental Methodology
Algorithms: BBMB(i) – Branch and Bound with MB(i); BFMB(i) – Best-First with MB(i); MBE(i)
Test networks: random coding (Bayesian), CPCS (Bayesian), random CSPs
Measures of performance: accuracy given a fixed amount of time (how close the cost found is to the optimal solution), and trade-off performance as a function of time.
Empirical Evaluation of mini-bucket heuristics: Bayesian networks, coding
[Figure: fraction of problems solved exactly vs. time (0-30 sec) for BBMB and BFMB with i = 2, 6, 10, 14; Random Coding, K = 100, noise = 0.28 (left) and noise = 0.32 (right)]
Max-CSP experiments(Kask and Dechter, 2000)
Dynamic MB Heuristics
Rather than pre-compiling, the mini-bucket heuristics can be generated during search: dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node in the search space (a partial assignment along the given variable ordering).
Dynamic MB and MBTE Heuristics (Kask, Marinescu and Dechter, 2003)
Rather than precompiling, compute the heuristics during search.
Dynamic MB: dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node during search.
Dynamic MBTE: compute heuristics simultaneously for all un-instantiated variables using mini-bucket-tree elimination. MBTE is an approximation scheme defined over cluster trees; it outputs multiple bounds, one for each variable and value extension, at once.
Cluster Tree Elimination - example
Cluster tree for the network over A, B, C, D, E, F, G: clusters 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G}, with separators BC (1-2), BF (2-3), EF (3-4). The messages are:
h(1,2)(b,c) = max_a p(a) p(b|a) p(c|a,b)
h(2,3)(b,f) = max_{c,d} p(d|b) p(f|c,d) h(1,2)(b,c)
h(3,4)(e,f) = max_b p(e|b,f) h(2,3)(b,f)
h(4,3)(e,f) = p(G = g_e | e,f)
h(3,2)(b,f) = max_e p(e|b,f) h(4,3)(e,f)
h(2,1)(b,c) = max_{d,f} p(d|b) p(f|c,d) h(3,2)(b,f)
Mini-Clustering
Motivation: the time and space complexity of Cluster Tree Elimination depend on the induced width w* of the problem; when w* is big, the CTE algorithm becomes infeasible.
The basic idea: reduce the size of the cluster (the exponent) by partitioning each cluster into mini-clusters with fewer variables. The accuracy parameter i is the maximum number of variables in a mini-cluster. The idea was explored for variable elimination (Mini-Bucket).
Splitting a cluster into mini-clusters bounds the complexity. Exponential complexity decrease: O(e^n) → O(e^r) + O(e^{n-r})
Idea of Mini-Clustering
cluster(u) = { h1, ..., hr, hr+1, ..., hn }
Exact (CTE) message: h = max_elim Π_{i=1..n} hi
Mini-Clustering splits the cluster into { h1, ..., hr } and { hr+1, ..., hn } and eliminates in each part separately:
g = max_elim Π_{i=1..r} hi,   h = max_elim Π_{i=r+1..n} hi
sending g and h instead of the exact message.
Mini-Clustering - example
Same cluster tree as before: 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G}, with separators BC, BF, EF. Cluster 2 is split into mini-clusters, so each edge now carries a set of messages H:
H(1,2) = { h1(1,2)(b,c) = max_a p(a) p(b|a) p(c|a,b) }
H(2,3) = { h1(2,3)(b) = max_{c,d} p(d|b) h1(1,2)(b,c),  h2(2,3)(f) = max_{c,d} p(f|c,d) }
H(3,4) = { h1(3,4)(e,f) = max_b p(e|b,f) h1(2,3)(b) h2(2,3)(f) }
H(4,3) = { h1(4,3)(e,f) = p(G = g_e | e,f) }
H(3,2) = { h1(3,2)(b,f) = max_e p(e|b,f) h1(4,3)(e,f) }
H(2,1) = { h1(2,1)(b) = max_{d,f} p(d|b) h1(3,2)(b,f),  h2(2,1)(c) = max_{d,f} p(f|c,d) }
Mini Bucket Tree Elimination
Mini-Clustering
Correctness and completeness: Algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values.
MBTE is the special case obtained when the clusters are buckets in BTE.
Branch and Bound w/ Mini-Buckets
BB with static Mini-Bucket heuristics (s-BBMB): heuristic information is pre-compiled before search; static variable ordering; prunes the current variable.
BB with dynamic Mini-Bucket heuristics (d-BBMB): heuristic information is assembled during search; static variable ordering; prunes the current variable.
BB with dynamic Mini-Bucket-Tree heuristics (BBBT): heuristic information is assembled during search; dynamic variable ordering; prunes all future variables.
Empirical Evaluation
Algorithms: complete – BBBT, BBMB; incomplete – DLM, GLS, SLS, IJGP, IBP (coding)
Measures: time, accuracy (% exact), #backtracks, bit error rate (coding)
Benchmarks: coding networks, Bayesian Network Repository, grid networks (N-by-N), random noisy-OR networks, random networks, real-world benchmarks
Reported: average accuracy and time; 30 samples, 10 observations, 30 seconds.
Empirical Results: Max-CSP
Random binary problems ⟨N, K, C, T⟩: N – number of variables, K – domain size, C – number of constraints, T – tightness. Task: Max-CSP.
[Figure: BBBT(i) vs. BBMB(i), N = 100, for i = 2, ..., 7]
Searching the Graph; caching goods
C context(C) = [ABC]
D context(D) = [ABD]
F context(F) = [F]
E context(E) = [AE]
B context(B) = [AB]
A context(A) = [A]
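The effect of caching on contexts can be sketched on a simple chain, where each variable's context is just its predecessor (hypothetical costs, not the slides' example). Memoizing on (variable, context value) collapses the exponential OR tree into a linear-size graph:

```python
from functools import lru_cache

# A chain X0 - X1 - ... - X(n-1) with hypothetical pairwise costs: the best
# completion below X(i) depends only on the value of X(i-1), its context.
n = 20

def c(i, u, v):
    # hypothetical cost between X(i-1) = u and X(i) = v
    return (u + 2 * v + i) % 3

calls = 0

@lru_cache(maxsize=None)
def best(i, prev):
    # min cost of assigning X(i), ..., X(n-1), given X(i-1) = prev
    global calls
    calls += 1
    if i == n:
        return 0
    return min(c(i, prev, v) + best(i + 1, v) for v in (0, 1))

opt = min(best(1, v) for v in (0, 1))
print(calls)  # → 40: one call per (variable, context value), not 2**20 tree nodes
```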
[Figure: the weighted OR search graph over A, B, C, D, E, F with the pairwise cost functions f1-f9; nodes with identical contexts are merged, and the values below each context are cached ("goods") and reused]
Searching the Graph; caching goods
[Figure: the same context-minimal search graph with node values filled in bottom-up; each context is solved once rather than once per tree node, and the root value 5 is the optimal cost]
Outline
Introduction
Optimization tasks for graphical models; solving by inference and search
Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
Hybrids of search and inference: cutset decomposition; super-bucket scheme