compiling graphical models

109
Compiling Graphical Models Adnan Darwiche University of California, Los Angeles UAI’06 Tutorial

Upload: tanuja

Post on 10-Jan-2016



TRANSCRIPT

Page 1: Compiling Graphical Models

Compiling Graphical Models

Adnan Darwiche
University of California, Los Angeles

UAI’06 Tutorial

Page 2: Compiling Graphical Models

Compilation: Historical Motivation

Separate inference into two phases:
  Offline: Compile model into a structure
  Online: Use structure to answer queries

Goal: Push as much work as possible into the offline phase to optimize online inference time

Best initial example:
  Offline: Compile a Bayesian network into a jointree
  Online: Use jointree to answer multiple queries efficiently

Page 3: Compiling Graphical Models

Compilation: Modern Motivation

Exploit model structure in inference:

Global structure:
  Exhibited in model topology
  Measured by treewidth
  Exploited by most (non-compilation) algorithms

Local structure:
  Exhibited in model parameters
  Type 1: Determinism
  Type 2: Context-specific independence

Main theme: local structure is best exploited in the context of compilation

Page 4: Compiling Graphical Models

Compilation: Theoretical Implications

Unifies inference paradigms:
  Variable elimination
  Jointree (Tree clustering)
  Conditioning

Compilation as a trace of classical inference

Page 5: Compiling Graphical Models

Bayesian Networks

[Figure: car-diagnosis Bayesian network with nodes Battery Age, Alternator, Fan Belt, Battery, Charge Delivered, Battery Power, Starter, Radio, Lights, Engine Turn Over, Gas Gauge, Gas, Fuel Pump, Fuel Line, Distributor, Spark Plugs, Engine Start]

Local Knowledge

Page 6: Compiling Graphical Models

Bayesian Networks

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start), with the table for Lights given Battery Power]

Battery Power   Lights = ON   Lights = OFF
OK              .99           .01
WEAK            .20           .80
DEAD            0             1

If Battery Power = OK, then Lights = ON (99%)

….

Page 7: Compiling Graphical Models

Bayesian Networks

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start)]

Page 8: Compiling Graphical Models

Global Structure: Treewidth w

O(n exp(w))

Page 9: Compiling Graphical Models

Local Structure: CSI and Determinism

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start)]

Page 10: Compiling Graphical Models

Local Structure: CSI and Determinism

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start)]

Context Specific Independence (CSI)

Page 11: Compiling Graphical Models

Local Structure: CSI and Determinism

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start), with the table for Lights given Battery Power]

Battery Power   Lights = ON   Lights = OFF
OK              .99           .01
WEAK            .20           .80
DEAD            0             1

Determinism: If Battery Power = DEAD, then Lights = OFF

Page 12: Compiling Graphical Models

Today’s Models …

Characterized by:
  Richness in local structure (determinism, CSI)
  Massiveness in size (100,000s of variables not uncommon)
  High connectivity (treewidth > 50, > 100)

Enabled by:
  High level modeling tools: relational, first order
  New application areas (synthesis): Bioinformatics (e.g. linkage analysis), Sensor networks

Exploiting local structure a must!

Page 13: Compiling Graphical Models

High Order Specifications: Relational Models…

burglary(v)=0.005;
alarm(v)=(burglary(v):0.95,0.01);
calls(v,w)= (neighbor(v,w): (prankster(v)): (alarm(w):0.9,0.05), (alarm(w):0.9,0)),0);
alarmed(v)= n-or{calls(w,v)|w:neighbor(w,v)}

Primula

Page 14: Compiling Graphical Models
Page 15: Compiling Graphical Models

Friends and Smokers (Richardson & Domingos, 2004)

M individuals

Relations such as smokes(p), cancer(p), friend(p1,p2)

Logical constraints such as: if one of p's friends smokes, then p smokes.

Sample Query: probability that a given person has cancer

M    Network params   Treewidth* w   CNF Vars   CNF Clauses
1    34               3              12         31
4    1,552            13             414        1,390
7    7,714            36             1,995      6,916
10   21,760           70             5,565      19,525
13   46,930           118            11,934     42,133
16   86,464           172            21,912     77,656
19   143,602          244            36,309     129,010
22   221,584          316            55,935     199,111
25   323,650          412            81,600     290,875
28   453,040          528            114,114    407,218
29   502,802          560            126,614    451,965

Page 16: Compiling Graphical Models

Students (Pasula & Russell, 2001)

P professors, S students

Various relations, such as famous(p), well-funded(p), success(s), advises(p,s)

Sample Query: probability a professor is well-funded given success of advised students

Students-Profs   Network params   Treewidth w   CNF Vars
04-08            11,566           72            3,099
04-16            21,070           101           5,859
05-10            20,688           128           5,624
05-20            38,168           148           10,734
06-12            33,454           176           9,209
06-24            62,302           233           17,693

Page 17: Compiling Graphical Models

Genetic Linkage Analysis

Ordering genes on a chromosome and determining distance between them

Useful for predicting and detecting diseases

Associating functionality of genes with their location on the chromosome

[Figure: chromosome with Gene 1, Gene 2, Gene 3]

Page 18: Compiling Graphical Models

Pedigrees + Phenotype + Genotype

Page 19: Compiling Graphical Models
Page 20: Compiling Graphical Models

DBNs from Speech Applications

Page 21: Compiling Graphical Models

Coding Networks

Page 22: Compiling Graphical Models

Tutorial Outline

Theoretical foundations
Online query answering algorithms
Offline compilation algorithms
Applications
Concluding remarks

Page 23: Compiling Graphical Models

Theoretical Foundations

Graphical Model (Bayesian, Markov Networks): Is a Multi-Linear Function (MLF)

Compiled Model: Is an Arithmetic Circuit (AC)

Compilation process: Factoring MLF into AC

Page 24: Compiling Graphical Models

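Multi-Linear Functions → Arithmetic Circuits (Factoring)

f = λa λb θa θb|a + λa λ~b θa θ~b|a + λ~a λb θ~a θb|~a + λ~a λ~b θ~a θ~b|~a

[Figure: network A → B; the MLF f is factored into an arithmetic circuit of + and * nodes over the leaves λa, λ~a, λb, λ~b, θa, θ~a, θb|a, θ~b|a, θb|~a, θ~b|~a]

A Differential Approach to Inference in Bayesian Networks, JACM-03 (Darwiche)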

Page 25: Compiling Graphical Models

Factoring Multi-linear Functions (MLFs)

MLF: a + ad + abd + abcd

[Figure: Arithmetic Circuit (AC) of + and * nodes over the leaves a, b, c, d, 1 that computes the MLF]

An MLF has an exponential number of terms, yet it may be represented by an AC with polynomial size!

• A graphical model defines an MLF

• Evaluating the MLF for a given evidence gives the probability of evidence

• The inference problem can be formulated as factoring the MLF of a graphical model

Circuit Complexity: Size of smallest AC that computes the MLF

Page 26: Compiling Graphical Models

Graphical Models as MLFs

A       B       Pr(.)
true    true    .03
true    false   .27
false   true    .56
false   false   .14

Pr(a) = .03 + .27 = .3

Page 27: Compiling Graphical Models

Graphical Models as MLFs

A       B       Pr(.)
true    true    .03
true    false   .27
false   true    .56
false   false   .14

Pr(~b) = .27 + .14 = .41

Page 28: Compiling Graphical Models

Graphical Models as MLFs

A       B       Pr(.)   Term
true    true    .03     λa * λb * .03
true    false   .27     λa * λ~b * .27
false   true    .56     λ~a * λb * .56
false   false   .14     λ~a * λ~b * .14

F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b

Page 29: Compiling Graphical Models

F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b

Pr(a,~b) = F(λ~a:0, λ~b:1, λa:1, λb:0) = .27

Pr(a) = F(λ~a:0, λ~b:1, λa:1, λb:1) = .03 + .27 = .3
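The two evaluations above can be checked mechanically. The following is a minimal sketch, not code from the tutorial: it stores the four-row joint table from these slides and evaluates F term by term. The names `JOINT`, `evaluate_mlf` and `indicators` are hypothetical helpers introduced only for illustration.

```python
# Joint distribution from the slides: Pr(A, B).
JOINT = {
    (True, True): 0.03,
    (True, False): 0.27,
    (False, True): 0.56,
    (False, False): 0.14,
}

def evaluate_mlf(lam_a, lam_not_a, lam_b, lam_not_b):
    """Evaluate F by summing, for every row, Pr(row) times the row's indicators."""
    total = 0.0
    for (a, b), pr in JOINT.items():
        ind_a = lam_a if a else lam_not_a
        ind_b = lam_b if b else lam_not_b
        total += pr * ind_a * ind_b
    return total

def indicators(evidence):
    """Map evidence such as {"A": True} to the tuple (λa, λ~a, λb, λ~b)."""
    def pair(var):
        if var not in evidence:
            return (1, 1)          # unobserved: both indicators stay at 1
        return (1, 0) if evidence[var] else (0, 1)
    return pair("A") + pair("B")

pr_a_not_b = evaluate_mlf(*indicators({"A": True, "B": False}))  # Pr(a, ~b)
pr_a = evaluate_mlf(*indicators({"A": True}))                    # Pr(a)
```

This brute-force evaluation visits every row of the joint table, so it is exponential in the number of variables; the point of the arithmetic circuit is to compute the same value from a factored representation.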

Page 30: Compiling Graphical Models

[Figure: network A → B, A → C with parameters θa, θb|a, θc|a]

A   B    C    Pr(.)
a   b    c    θa θb|a θc|a
a   b    ~c   θa θb|a θ~c|a
a   ~b   c    θa θ~b|a θc|a
a   ~b   ~c   θa θ~b|a θ~c|a
.   .    .    …

Page 31: Compiling Graphical Models

[Figure: network A → B, A → C with parameters θa, θb|a, θc|a]

A   B    C    Pr(.)
a   b    c    λa λb λc θa θb|a θc|a
a   b    ~c   λa λb λ~c θa θb|a θ~c|a
a   ~b   c    λa λ~b λc θa θ~b|a θc|a
a   ~b   ~c   λa λ~b λ~c θa θ~b|a θ~c|a
.   .    .    …

Page 32: Compiling Graphical Models

F = λa λb λc θa θb|a θc|a + λa λb λ~c θa θb|a θ~c|a + λa λ~b λc θa θ~b|a θc|a + λa λ~b λ~c θa θ~b|a θ~c|a + ….

[Figure: network A → B, A → C]

Page 33: Compiling Graphical Models

F = λa λb λc λd θa θb|a θc|a θd|bc + λa λb λc λ~d θa θb|a θc|a θ~d|bc + ….

[Figure: network A → B, A → C, B → D, C → D with parameters θa, θb|a, θc|a, θd|bc]

Each term has 2n variables (n indicators, n parameters)

Each variable has degree one (multi-linear function)

Page 34: Compiling Graphical Models

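Multi-Linear Functions → Arithmetic Circuits (Factoring)

f = λa λb θa θb|a + λa λ~b θa θ~b|a + λ~a λb θ~a θb|~a + λ~a λ~b θ~a θ~b|~a

[Figure: network A → B; the MLF f is factored into an arithmetic circuit of + and * nodes over the indicator and parameter leaves]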

Page 35: Compiling Graphical Models

Online Query Answering

Complexity: Time and space linear in the AC size

Queries:
  Probability of evidence, with evidence flipping/fast retraction
  Variable and family marginals
  MPE: most probable explanation
  Sensitivity analysis (derivatives)

Page 36: Compiling Graphical Models

Evaluating the Polynomial

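F(e) = Pr(e), where F(e) is the value of F with each indicator set to 1 if it is consistent with evidence e and to 0 otherwise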

Page 37: Compiling Graphical Models

PR: Probability of Evidence

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start)]

Pr(e)

Page 38: Compiling Graphical Models

The Partial Derivatives

∂F/∂λx (e) = Pr(e − X, x)

Page 39: Compiling Graphical Models

PR: Probability of Evidence Flips

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start), with a variable X marked]

Pr(e)

Page 40: Compiling Graphical Models

PR: Probability of Evidence Flips

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start), with a variable X marked]

Pr(e − X, x)

Page 41: Compiling Graphical Models

The Partial Derivatives

θx|u · ∂F/∂θx|u (e) = Pr(e, x, u)

Page 42: Compiling Graphical Models

PR: Family Marginals

[Figure: car-diagnosis Bayesian network (Battery Age, Alternator, Fan Belt, ..., Engine Start), with a variable X and its parents U marked]

Pr(e,x,u)

Page 43: Compiling Graphical Models

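Multi-Linear Functions → Arithmetic Circuits (Factoring)

f = λa λb θa θb|a + λa λ~b θa θ~b|a + λ~a λb θ~a θb|~a + λ~a λ~b θ~a θ~b|~a

[Figure: network A → B; the MLF f is factored into an arithmetic circuit of + and * nodes over the indicator and parameter leaves]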

Page 44: Compiling Graphical Models

Circuit Evaluation and Differentiation: Marginals

[Figure: the arithmetic circuit for network A → B evaluated under evidence a (λa = 1, λ~a = 0, λb = λ~b = 1) with parameters θa = .3, θ~a = .7, θb|a = .1, θ~b|a = .9, θb|~a = .8, θ~b|~a = .2. The upward pass gives Pr(a) = .3 at the root. The downward pass gives a derivative at every node, e.g. ∂f/∂λb (a) = .03 = Pr(a, b) and ∂f/∂λ~a (a) = .7 = Pr(~a).]

Two passes only:

•Probability of evidence (with evidence flipping)
•Node marginals
•Family marginals
•Sensitivity
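The two passes can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the tutorial's implementation: the circuit is stored as a list in which children always precede parents, and `build_circuit` and `evaluate_and_differentiate` are hypothetical names. It uses one value register and one derivative register per node, which is the simplest bookkeeping rather than one of the register-saving schemes.

```python
from math import prod

LEAVES = ["λa", "λ~a", "λb", "λ~b", "θa", "θ~a", "θb|a", "θ~b|a", "θb|~a", "θ~b|~a"]

def build_circuit():
    """AC for f = θa λa (θb|a λb + θ~b|a λ~b) + θ~a λ~a (θb|~a λb + θ~b|~a λ~b)."""
    nodes = [("leaf", name) for name in LEAVES]
    idx = {name: i for i, name in enumerate(LEAVES)}

    def add(op, children):
        nodes.append((op, children))       # children were appended earlier
        return len(nodes) - 1

    b_given_a = add("+", [add("*", [idx["θb|a"], idx["λb"]]),
                          add("*", [idx["θ~b|a"], idx["λ~b"]])])
    b_given_not_a = add("+", [add("*", [idx["θb|~a"], idx["λb"]]),
                              add("*", [idx["θ~b|~a"], idx["λ~b"]])])
    root = add("+", [add("*", [idx["θa"], idx["λa"], b_given_a]),
                     add("*", [idx["θ~a"], idx["λ~a"], b_given_not_a])])
    return nodes, idx, root

def evaluate_and_differentiate(nodes, root, leaf_values):
    val = [0.0] * len(nodes)
    for i, (op, arg) in enumerate(nodes):          # upward pass: values
        if op == "leaf":
            val[i] = leaf_values[arg]
        elif op == "+":
            val[i] = sum(val[c] for c in arg)
        else:
            val[i] = prod(val[c] for c in arg)
    der = [0.0] * len(nodes)
    der[root] = 1.0
    for i in range(len(nodes) - 1, -1, -1):        # downward pass: derivatives
        op, arg = nodes[i]
        if op == "leaf":
            continue
        for c in arg:
            if op == "+":
                der[c] += der[i]
            else:                                  # product of the other children
                der[c] += der[i] * prod(val[o] for o in arg if o != c)
    return val, der

nodes, idx, root = build_circuit()
leaf_values = {"λa": 1, "λ~a": 0, "λb": 1, "λ~b": 1,          # evidence: a
               "θa": 0.3, "θ~a": 0.7, "θb|a": 0.1, "θ~b|a": 0.9,
               "θb|~a": 0.8, "θ~b|~a": 0.2}
val, der = evaluate_and_differentiate(nodes, root, leaf_values)
```

With evidence a, `val[root]` is Pr(a), `der[idx["λb"]]` is Pr(a, b) and `der[idx["λ~a"]]` is Pr(~a), matching the annotations on the slide.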

Page 45: Compiling Graphical Models

Efficient Eval/Diff Schemes

Assume alternating levels of +/* nodes, with one parent per *node

Method A: Two registers per +node (no registers for *nodes)

Method B: One register per node (use for values in upward pass, then override with derivatives in downward pass)

Method C: One register per node, one bit per *node

Page 46: Compiling Graphical Models

Circuit Optimization: MPE

[Figure: the circuit with its + nodes replaced by maximization (m) nodes, evaluated under evidence a with the leaf values .3, 1, .1, 1, .9, .8, 1, .2, 0, .7; the root value is MPE(a) = .27]

Page 47: Compiling Graphical Models

Circuit Optimization: MPE

[Figure: the subcircuit selected by the maximization (m) nodes, over the leaves θa, θ~b|a, λa, λ~b]

MPE: a, ~b
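A minimal sketch of the maximizer-circuit idea, assuming the same two-variable network and parameter values used in the evaluation example. The nested-tuple encoding and the names `branch`, `CIRCUIT` and `mpe` are hypothetical, chosen for brevity; a real circuit would share nodes rather than nest them.

```python
from math import prod

THETA = {"a": 0.3, "~a": 0.7, "b|a": 0.1, "~b|a": 0.9, "b|~a": 0.8, "~b|~a": 0.2}

def branch(a_lit):
    """Sub-circuit for one value of A, with max in place of +."""
    return ("*", [("θ", a_lit), ("λ", a_lit),
                  ("max", [("*", [("θ", f"b|{a_lit}"), ("λ", "b")]),
                           ("*", [("θ", f"~b|{a_lit}"), ("λ", "~b")])])])

CIRCUIT = ("max", [branch("a"), branch("~a")])

def mpe(node, lam):
    """Return (value, literals) of the best instantiation below `node`."""
    kind, arg = node
    if kind == "θ":
        return THETA[arg], []
    if kind == "λ":
        return lam[arg], [arg]
    results = [mpe(child, lam) for child in arg]
    if kind == "max":
        return max(results, key=lambda r: r[0])    # keep the winning child only
    value = prod(v for v, _ in results)
    literals = [lit for _, lits in results for lit in lits]
    return value, literals

evidence_a = {"a": 1, "~a": 0, "b": 1, "~b": 1}
value, assignment = mpe(CIRCUIT, evidence_a)
```

Under evidence a the sketch returns a value of about .27 and the instantiation a, ~b, in agreement with the slides.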

Page 48: Compiling Graphical Models

Custom Hardware for Evaluating ACs

Adharapurapu, Ercegovac (2004)

Page 49: Compiling Graphical Models

Offline Compilation

Factoring MLFs into ACs:
  Jointree: Embeds AC
  Variable Elimination: Trace is an AC
  Recursive Conditioning: Trace is an AC

Reduction to Logic: CNF to d-DNNF compilation

Page 50: Compiling Graphical Models

Compiling using Jointrees

Classical Jointree Algorithm:
  Convert model into jointree
  Jointree propagation (two passes)

Modern interpretation:
  Jointree embeds an AC that factors MLF
  Jointree propagation is evaluating/differentiating embedded AC

Page 51: Compiling Graphical Models

A Jointree Embeds an AC…

[Figure: a jointree with clusters such as AB, AC, AD, AE and a designated root, shown alongside the arithmetic circuit it embeds over cluster and separator instantiations]

Inward pass evaluates circuit
Outward pass differentiates circuit
[Hugin, Shenoy-Shafer, …]

A Differential Semantics for Jointree Algorithms, AIJ-04 (with James Park)

Page 52: Compiling Graphical Models

Efficient Eval/Diff Schemes

Assume alternating levels of +/* nodes, with one parent per *node

Method A: Two registers per +node (no registers for *nodes)

Method B: One register per node (use for values in upward pass, then override with derivatives in downward pass)

Method C: One register per node, one bit per *node

Page 53: Compiling Graphical Models

Jointree Flavors

Shenoy-Shafer: Method A

Hugin: Method B (loses information)

Zero-Conscious Hugin (new): Method C (best of A, B)

Page 54: Compiling Graphical Models

Compiling using Variable Elimination (VE)

VE operates on factors: Mappings from variable instantiations to real numbers

VE performs two operations on factors:
  Multiply two factors
  Sum-Out a variable from factor

Factors have different representations:
  Tables
  More structured representations (decision trees/graphs)
  Overhead problem for structured factors

Page 55: Compiling Graphical Models

Tabular Factors

[Figure: network A → B]

A       TA
true    .3
false   .7

A       B       TB
true    true    .1
true    false   .9
false   true    .8
false   false   .2
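The two factor operations can be illustrated on these tables. This is a sketch with hypothetical names (`multiply`, `sum_out`, `T_A`, `T_B`) that assumes binary variables and represents a factor as a pair of a variable tuple and a dictionary from instantiations to numbers.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two tabular factors over the union of their variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for values in product([True, False], repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(row[v] for v in f_vars)]
                           * g_tab[tuple(row[v] for v in g_vars)])
    return out_vars, out_tab

def sum_out(var, f):
    """Add up the rows of f that agree on every variable except `var`."""
    f_vars, f_tab = f
    keep = tuple(v for v in f_vars if v != var)
    out_tab = {}
    for values, number in f_tab.items():
        key = tuple(val for v, val in zip(f_vars, values) if v != var)
        out_tab[key] = out_tab.get(key, 0.0) + number
    return keep, out_tab

T_A = (("A",), {(True,): 0.3, (False,): 0.7})
T_B = (("A", "B"), {(True, True): 0.1, (True, False): 0.9,
                    (False, True): 0.8, (False, False): 0.2})

# Eliminating A from the product leaves the marginal over B.
marginal_b = sum_out("A", multiply(T_A, T_B))
```

The result assigns .59 to B = true and .41 to B = false, which agrees with Pr(~b) = .41 computed earlier from the joint table.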

Page 56: Compiling Graphical Models

Structured Factors: Algebraic Decision Diagrams (ADDs)

[Figure: an ADD over variables X, Y, Z with terminal values .1, .9, .5]

Page 57: Compiling Graphical Models

Network  Max Clust  Vars  Card  Total Parms  %Det  %Distinct

alarm 7.2 37 2...4 752 0.9 24.6

bm 20 1005 2...2 6972 99.6 100

diabetes 17.2 413 3...21 461069 78.2 17.6

hailfinder 11.7 56 2...11 3741 15.7 26.9

mildew 21.4 35 3...100 547158 93.2 25.1

mm 23 1220 2...2 8326 98.7 75

munin1 26.8 189 1...21 19466 66.5 61.2

munin2 18.6 1003 2...21 83920 63.3 69.5

munin3 17.8 1044 1...21 85855 63.1 71.3

munin4 21.4 1041 1...21 98183 64.5 65.3

pathfinder 15 109 2...63 97851 56.1 5.1

pigs 17.4 441 3...3 8427 56.2 23.9

students 22 376 2...2 2616 90.7 79.3

tcc4f 10 105 2...2 3236 0.4 35.6

water 19.9 32 3...4 13484 54 57

Networks with Local Structure

Page 58: Compiling Graphical Models

VE: Tabular vs ADD Representations of Factors

Tabular ADD

Network Time (ms) Time (ms) Improvement

alarm 31 360 0.086

barley 307 14,049 0.022

bm-5-3 4,892 658 7.435

diabetes 949 33,220 0.029

hailfinder 48 515 0.093

link 1,688 2,658 0.635

mm-3-8-3 2,166 843 2.569

mildew 72 92,602 0.001

munin1 155 1,255 0.124

munin2 204 3,170 0.064

munin3 350 5,049 0.069

munin4 406 4,361 0.093

pathfinder 51 5,213 0.01

pigs 69 597 0.116

st-3-2 186 362 0.514

tcc4f 29 153 0.19

water 76 1,015 0.075

Page 59: Compiling Graphical Models

Compiling using Variable Elimination (VE)

By using symbolic factors and corresponding operations: VE compiles out an AC

VE with tabular factors: Generates ACs similar to those embedded in jointree

VE with structured factors:
  Generates much smaller ACs
  Overhead pushed into offline phase

Page 60: Compiling Graphical Models

Factors

[Figure: network A → B]

A       TA
true    .3
false   .7

A       B       TB
true    true    .1
true    false   .9
false   true    .8
false   false   .2

Page 61: Compiling Graphical Models

Symbolic Factors

[Figure: network A → B]

A       TA
true    θa * λa
false   θ~a * λ~a

A       B       TB
true    true    θb|a * λb
true    false   θ~b|a * λ~b
false   true    θb|~a * λb
false   false   θ~b|~a * λ~b

Page 62: Compiling Graphical Models

Summing out Variable B

A       B       TB
true    true    θb|a * λb
true    false   θ~b|a * λ~b
false   true    θb|~a * λb
false   false   θ~b|~a * λ~b

Summing out B gives:

A       T’B
true    θb|a * λb + θ~b|a * λ~b
false   θb|~a * λb + θ~b|~a * λ~b

Page 63: Compiling Graphical Models

Multiplying Factors

A       TA
true    θa * λa
false   θ~a * λ~a

*

A       T’B
true    θb|a * λb + θ~b|a * λ~b
false   θb|~a * λb + θ~b|~a * λ~b

=

A       TA T’B
true    θa * λa * (θb|a * λb + θ~b|a * λ~b)
false   θ~a * λ~a * (θb|~a * λb + θ~b|~a * λ~b)

Page 64: Compiling Graphical Models

Summing out Variable A

A       TA T’B
true    θa * λa * (θb|a * λb + θ~b|a * λ~b)
false   θ~a * λ~a * (θb|~a * λb + θ~b|~a * λ~b)

Result:

θa * λa * (θb|a * λb + θ~b|a * λ~b) + θ~a * λ~a * (θb|~a * λb + θ~b|~a * λ~b)
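The symbolic run above can be reproduced by letting factor entries be circuit nodes instead of numbers, so that multiply and sum-out build * and + nodes rather than computing values. The sketch below is an illustration under that assumption; `leaf`, `multiply`, `sum_out` and `evaluate` are hypothetical names, and variables are assumed binary.

```python
from itertools import product

def leaf(theta, lam):
    """Symbolic entry θ * λ, as in the symbolic factors on the slides."""
    return ("*", ("leaf", theta), ("leaf", lam))

def multiply(f, g):
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out = {}
    for values in product([True, False], repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        out[values] = ("*", f_tab[tuple(row[v] for v in f_vars)],
                            g_tab[tuple(row[v] for v in g_vars)])
    return out_vars, out

def sum_out(var, factor):
    f_vars, f_tab = factor
    keep = tuple(v for v in f_vars if v != var)
    out = {}
    for values, node in f_tab.items():
        key = tuple(val for v, val in zip(f_vars, values) if v != var)
        out[key] = ("+", out[key], node) if key in out else node
    return keep, out

def evaluate(node, leaf_values):
    if node[0] == "leaf":
        return leaf_values[node[1]]
    left = evaluate(node[1], leaf_values)
    right = evaluate(node[2], leaf_values)
    return left * right if node[0] == "*" else left + right

T_A = (("A",), {(True,): leaf("θa", "λa"), (False,): leaf("θ~a", "λ~a")})
T_B = (("A", "B"), {(True, True): leaf("θb|a", "λb"), (True, False): leaf("θ~b|a", "λ~b"),
                    (False, True): leaf("θb|~a", "λb"), (False, False): leaf("θ~b|~a", "λ~b")})

# Same order as the slides: sum out B, multiply by TA, then sum out A.
_, table = sum_out("A", multiply(T_A, sum_out("B", T_B)))
circuit = table[()]

values = {"θa": 0.3, "θ~a": 0.7, "θb|a": 0.1, "θ~b|a": 0.9, "θb|~a": 0.8, "θ~b|~a": 0.2,
          "λa": 1, "λ~a": 0, "λb": 1, "λ~b": 1}   # evidence: a
pr_a = evaluate(circuit, values)
```

The elimination runs once, offline; the resulting `circuit` can then be re-evaluated under any evidence by changing only the indicator values.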

Page 65: Compiling Graphical Models

VE factors MLF into AC (Bottom up Construction)

f = λa λb θa θb|a + λa λ~b θa θ~b|a + λ~a λb θ~a θb|~a + λ~a λ~b θ~a θ~b|~a

[Figure: network A → B; the MLF f is factored into an arithmetic circuit of + and * nodes over the indicator and parameter leaves]

•Time and space complexity of generating AC is similar to Variable Elimination: Exponential only in treewidth

•Generated ACs similar to those embedded in Jointree

•Recall: AC can be used to answer multiple queries!

Page 66: Compiling Graphical Models

Structured Factors: Algebraic Decision Diagrams (ADDs)

[Figure: an ADD over variables X, Y, Z with terminal values .1, .9, .5]

Page 67: Compiling Graphical Models

Structured Factors: Algebraic Decision Diagrams (ADDs)

[Figure: a symbolic ADD over variables X, Y, Z with terminals labeled 1, 2, 3]

Symbolic ADD

•Modify standard ADD operations (multiply, sum-out) to operate on symbolic ADDs

•Run variable elimination with symbolic ADDs

•Compile out an AC

•Asymptotic complexity is no worse than variable elimination

•Overhead of ADDs is pushed into offline phase

•Generated AC can be much smaller

•Online inference can be much faster

Page 68: Compiling Graphical Models

Network  Max Clust  Vars  Card  Total Parms  %Det  %Distinct

alarm 7.2 37 2...4 752 0.9 24.6

bm 20 1005 2...2 6972 99.6 100

diabetes 17.2 413 3...21 461069 78.2 17.6

hailfinder 11.7 56 2...11 3741 15.7 26.9

mildew 21.4 35 3...100 547158 93.2 25.1

mm 23 1220 2...2 8326 98.7 75

munin1 26.8 189 1...21 19466 66.5 61.2

munin2 18.6 1003 2...21 83920 63.3 69.5

munin3 17.8 1044 1...21 85855 63.1 71.3

munin4 21.4 1041 1...21 98183 64.5 65.3

pathfinder 15 109 2...63 97851 56.1 5.1

pigs 17.4 441 3...3 8427 56.2 23.9

students 22 376 2...2 2616 90.7 79.3

tcc4f 10 105 2...2 3236 0.4 35.6

water 19.9 32 3...4 13484 54 57

Networks with Local Structure

Page 69: Compiling Graphical Models

Tabular ADD

Network Time (ms) Time (ms) Improvement

alarm 31 360 0.086

barley 307 14,049 0.022

bm-5-3 4,892 658 7.435

diabetes 949 33,220 0.029

hailfinder 48 515 0.093

link 1,688 2,658 0.635

mm-3-8-3 2,166 843 2.569

mildew 72 92,602 0.001

munin1 155 1,255 0.124

munin2 204 3,170 0.064

munin3 350 5,049 0.069

munin4 406 4,361 0.093

pathfinder 51 5,213 0.01

pigs 69 597 0.116

st-3-2 186 362 0.514

tcc4f 29 153 0.19

water 76 1,015 0.075

Tabular vs ADD: Standard VE

Page 70: Compiling Graphical Models

Time (s) AC size

Network Ace ADD-VE Improv. Tabular-VE ADD-VE Improv.

alarm 0.3 3.9 0.1 3,534 3,030 1.2

barley 8,190.20 122.8 66.7 66,467,777 24,653,744 2.7

bm-5-3 0.8 6 0.1 75,591,750 14,836 5095.2

diabetes 1,710.00 110.3 15.5 34,728,957 17,219,042 2

hailfinder 0.7 1.2 0.5 72,755 25,992 2.8

link - 699.7 - 127,262,777 89,097,450 1.4

mildew 3,125.20 218.9 14.3 16,094,592 3,352,330 4.8

mm-3-8-3 1.5 11.9 0.1 36,635,566 108,428 337.9

munin1 1,005.10 316.7 3.2 1,260,407,123 31,409,970 40.1

munin2 198.4 31.7 6.3 20,295,426 5,662,218 3.6

munin3 188.4 17.6 10.7 16,987,088 3,503,242 4.8

munin4 205 37.8 5.4 76,028,532 6,869,760 11.1

pathfinder 4.9 5.8 0.9 796,588 44,468 17.9

pigs 23.1 10 2.3 4,925,388 2,558,680 1.9

st-3-2 0.5 2.4 0.2 19,374,934 22,070 877.9

tcc4f 0.9 1.1 0.8 33,408 22,612 1.5

water 3 20.7 0.1 15,996,054 170,428 93.9

Tabular vs ADD: VE Compilations

Page 71: Compiling Graphical Models

Network Jointree ADD-VE Improv.

alarm 166 32 5.2

barley 65,226 35,209 1.9

bm-5-3 89,593 83 1079.4

diabetes 29,316 20,421 1.4

hailfinder 245 70 3.5

link 223,542 175,769 1.3

mildew 10,077 4,522 2.2

mm-3-8-3 34,001 198 171.7

munin1 669,915 37,451 17.9

munin2 17,857 7,180 2.5

munin3 13,351 4,945 2.7

munin4 42,754 8,683 4.9

pathfinder 1,332 102 13.1

pigs 3,020 2,814 1.1

st-3-2 17,536 82 213.9

tcc4f 281 73 3.8

water 16,676 251 66.4

ADD-VE vs Jointree: Online Inference Time (ms)

Computing all marginals, for 16 pieces of random evidence

Work on structured representations of factors is now much more relevant and practical.

Page 72: Compiling Graphical Models

Compiling by Reduction to Logic

Algebraic: MLFs / ACs
Logical: CNF / d-DNNF

Factoring MLF into AC can be reduced to factoring CNF into d-DNNF

CNF to d-DNNF compilers are very powerful (natural for exploiting determinism and CSI)

Page 73: Compiling Graphical Models

Reduction to Logic

Multi-Linear Function → (Encode) → CNF → (Compiler) → d-DNNF → (Decode) → Arithmetic Circuit

Compiler: http://reasoning.cs.ucla.edu/c2d

Page 74: Compiling Graphical Models

MLFs, ACs, CNFs, d-DNNF

Multi-linear function: a c + a b c + c

Encode

Propositional theory: c ∧ (a ∨ ¬b)

Compile

[Figure: smooth d-DNNF over the literals of a, b, c]

Decode

[Figure: arithmetic circuit over the leaves a, b, c, 1]

Page 75: Compiling Graphical Models

Deterministic, Decomposable NNF

[Figure: an NNF circuit of alternating or/and nodes over literals of the variables A, B, C, D, E]

Page 76: Compiling Graphical Models

Deterministic, Decomposable NNF

[Figure: an NNF circuit of alternating or/and nodes over literals of the variables A, B, C, D, E]

Deterministic: Disjuncts are logically disjoint

Page 77: Compiling Graphical Models

Deterministic, Decomposable NNF

[Figure: an NNF circuit of alternating or/and nodes over literals of the variables A, B, C, D, E; the conjuncts under an and-node mention disjoint variables, e.g. B, C versus D, E]

Decomposable: Conjuncts share no variables

Compiling CNFs into d-DNNFs, AAAI-02, ECAI-04

Compiler at http://reasoning.cs.ucla.edu/c2d
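A small sketch of why the two properties matter: on an NNF that is deterministic, decomposable and also smooth, models can be counted by turning or into + and and into *. The example formula, c ∧ (a ∨ ¬b), and all names below (`lit`, `variables`, `is_decomposable`, `count_models`, `holds`) are hypothetical choices for illustration; the circuit was written by hand, not produced by c2d.

```python
from itertools import product
from math import prod

def lit(var, positive=True):
    return ("lit", (var, positive))

def variables(node):
    if node[0] == "lit":
        return {node[1][0]}
    return set().union(*(variables(child) for child in node[1]))

def is_decomposable(node):
    """Check that the children of every and-node mention disjoint variables."""
    if node[0] == "lit":
        return True
    if node[0] == "and":
        seen = set()
        for child in node[1]:
            child_vars = variables(child)
            if seen & child_vars:
                return False
            seen |= child_vars
    return all(is_decomposable(child) for child in node[1])

def count_models(node):
    """Valid only for smooth, deterministic, decomposable NNF."""
    if node[0] == "lit":
        return 1
    counts = [count_models(child) for child in node[1]]
    return prod(counts) if node[0] == "and" else sum(counts)

def holds(node, world):
    if node[0] == "lit":
        var, positive = node[1]
        return world[var] == positive
    results = [holds(child, world) for child in node[1]]
    return all(results) if node[0] == "and" else any(results)

# Smooth d-DNNF for c ∧ (a ∨ ¬b): the two disjuncts disagree on a.
NNF = ("and", [lit("c"),
               ("or", [("and", [lit("a"), ("or", [lit("b"), lit("b", False)])]),
                       ("and", [lit("a", False), lit("b", False)])])])

brute_force = sum(holds(NNF, dict(zip("abc", values)))
                  for values in product([True, False], repeat=3))
```

Here `count_models(NNF)` and the brute-force enumeration both give 3. The same or-to-plus, and-to-times replacement is what turns a compiled d-DNNF into an arithmetic circuit.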

Page 78: Compiling Graphical Models

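Recursive Conditioning for Compilation

[Figure: a CNF with clauses over A, B, C and over A, D, E is conditioned on A; under each value of A the remaining clauses split into independent parts over B, C and over D, E, giving an or-node over the two cases of A with and-nodes beneath it]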

Page 79: Compiling Graphical Models

Why Logic?

Encoding local structure is easy:
  Determinism encoded by adding clauses: θc|a = 0
  CSI encoded by collapsing variables: θc|ab = θc|a~b

A natural environment to exploit local structure:
  DD-backtracking, clause learning, …
  Non-structural decomposition
  Non-structural (formula) caching

Page 80: Compiling Graphical Models

Local Structure

[Figure: network in which A, B, C are the parents of S]

Tabular CPT (entries θs|abc):

A    B    C    Pr(S|A,B,C)
a    b    c    0.95
a    b    ~c   0.95
a    ~b   c    0.20
a    ~b   ~c   0.05
~a   b    c    0.00
~a   b    ~c   0.00
~a   ~b   c    0.00
~a   ~b   ~c   0.00

- Functional constraints
- Context-specific independence

Page 81: Compiling Graphical Models

Determinism

[Table: tabular CPT for Pr(S|A,B,C), as on the previous slide]

λ~a ∧ λb ∧ λc ∧ λs ↔ θs|~abc

Since θs|~abc = 0, this is encoded by the clause:

¬λ~a ∨ ¬λb ∨ ¬λc ∨ ¬λs

Page 82: Compiling Graphical Models

Context-Specific Independence

[Table: tabular CPT for Pr(S|A,B,C), as on the previous slide]

λa ∧ λb ∧ λc ∧ λs ↔ θs|abc

λa ∧ λb ∧ λ~c ∧ λs ↔ θs|ab~c

Since θs|abc = θs|ab~c, the two collapse into:

λa ∧ λb ∧ λs ↔ θs|ab

Page 83: Compiling Graphical Models

[Figure: a belief network over X and Y is encoded as a CNF over indicator and parameter variables, compiled into a smooth d-DNNF, and decoded into an arithmetic circuit]

The Ace System: http://reasoning.cs.ucla.edu/ace

Page 84: Compiling Graphical Models

Time (s) AC size

Network Ace ADD-VE Improv. Tabular-VE ADD-VE Improv.

alarm 0.3 3.9 0.1 3,534 3,030 1.2

barley 8,190.20 122.8 66.7 66,467,777 24,653,744 2.7

bm-5-3 0.8 6 0.1 75,591,750 14,836 5095.2

diabetes 1,710.00 110.3 15.5 34,728,957 17,219,042 2

hailfinder 0.7 1.2 0.5 72,755 25,992 2.8

link - 699.7 - 127,262,777 89,097,450 1.4

mildew 3,125.20 218.9 14.3 16,094,592 3,352,330 4.8

mm-3-8-3 1.5 11.9 0.1 36,635,566 108,428 337.9

munin1 1,005.10 316.7 3.2 1,260,407,123 31,409,970 40.1

munin2 198.4 31.7 6.3 20,295,426 5,662,218 3.6

munin3 188.4 17.6 10.7 16,987,088 3,503,242 4.8

munin4 205 37.8 5.4 76,028,532 6,869,760 11.1

pathfinder 4.9 5.8 0.9 796,588 44,468 17.9

pigs 23.1 10 2.3 4,925,388 2,558,680 1.9

st-3-2 0.5 2.4 0.2 19,374,934 22,070 877.9

tcc4f 0.9 1.1 0.8 33,408 22,612 1.5

water 3 20.7 0.1 15,996,054 170,428 93.9

ADD-VE vs Logic (Ace): Compile Times

Page 85: Compiling Graphical Models

Network Nodes Parameters Max Cluster

mastermind_04_08_03 1418 9802 26

mastermind_06_08_03 1814 12754 37

mastermind_10_08_03 2606 18658 54

mastermind_03_08_04 2288 16008 31

mastermind_04_08_04 2616 18488 39

mastermind_03_08_05 3692 26186 40

students_03_02 376 2616 25

students_03_12 1346 9856 59

students_04_16 2827 21070 101

students_05_20 5064 38168 148

students_06_24 8201 62302 233

blockmap_05_03 1005 6972 23

blockmap_10_03 6848 48758 52

blockmap_15_03 18787 132436 68

blockmap_20_03 43356 307220 92

blockmap_22_03 59404 423452 104

ADD-VE vs Logic (Ace)

Page 86: Compiling Graphical Models

Network  Offline Time (min)  AC Nodes  AC Edges  Online Inference Time (s)

mastermind_04_08_03 1 71,666 541,356 0.05

mastermind_06_08_03 1 258,228 1,523,888 0.15

mastermind_10_08_03 3 1,293,323 4,315,566 0.68

mastermind_03_08_04 2 186,351 4,859,201 0.3

mastermind_04_08_04 5 932,355 19,457,308 1.73

mastermind_03_08_05 10 1,359,391 55,417,639 4.33

students_03_02 1 7,927 37,281 0.01

students_03_12 1 24,219 113,876 0.02

students_04_16 3 181,166 815,461 0.09

students_05_20 7 1,319,834 5,236,257 1.84

students_06_24 33 9,922,233 36,450,231 12.97

blockmap_05_03 1 2,833 20,636 0.01

blockmap_10_03 2 17,749 974,817 0.06

blockmap_15_03 6 47,475 7,643,307 0.38

blockmap_20_03 30 105,602 40,172,434 2.45

blockmap_22_03 61 144,136 76,649,302 4.67

ADD-VE vs Logic (Ace)

Page 87: Compiling Graphical Models

Effect of Local Structure

Local Structure Encoded   Pathfinder       Water               Munin4
None                      981,178          13,777,166          116,136,985
Det + CSI                 42,810 (4%)      134,140 (1%)        5,762,690 (5%)
Det                       130,380 (13%)    138,501 (1%)        9,997,267 (9%)
CSI                       200,787 (20%)    11,111,104 (81%)    17,612,036 (15%)

Page 88: Compiling Graphical Models

Compilation vs Direct Inference

Grid problems here…

Page 89: Compiling Graphical Models

Compilation vs Direct Inference

Grid size   Treewidth w   Det   Cachet (sec)   Ace offline (sec)   Ace online (sec)   Offline/Online
16x16       25            50%   2236           220                 2.072              1079
22x22       36            75%   2757           349                 2.178              2024
34x34       60            90%   1584           79                  0.419              3783

Average over 10 random instances for each grid

Ace available at http://reasoning.cs.ucla.edu/ace

Page 90: Compiling Graphical Models

Applications

Relational Models
Diagnosis
Genetic Linkage Analysis

Page 91: Compiling Graphical Models

burglary(v)=0.005;
alarm(v)=(burglary(v):0.95,0.01);
calls(v,w)= (neighbor(v,w): (prankster(v)): (alarm(w):0.9,0.05), (alarm(w):0.9,0)),0);
alarmed(v)= n-or{calls(w,v)|w:neighbor(w,v)}

Primula/Ace: Upcoming Release

Page 92: Compiling Graphical Models
Page 93: Compiling Graphical Models
Page 94: Compiling Graphical Models
Page 95: Compiling Graphical Models
Page 96: Compiling Graphical Models

Friends and Smokers (Richardson & Domingos, 2004)

M individuals

Relations such as smokes(p), cancer(p), friend(p1,p2)

Logical constraints such as: if one of p's friends smokes, then p smokes.

Sample Query: probability that a given person has cancer

Page 97: Compiling Graphical Models

Friends & Smokers

M    Network params   Treewidth w   CNF Vars   CNF Clauses   AC Edges   Online Time (sec)   Offline Time (sec)
1    34               3             12         31            18         0                   0.03
4    1,552            13            414        1,390         293        0.003               0.44
7    7,714            36            1,995      6,916         1,295      0.006               1.92
10   21,760           70            5,565      19,525        3,512      0.005               6.66
13   46,930           118           11,934     42,133        7,430      0.013               12.8
16   86,464           172           21,912     77,656        13,535     0.022               21.68
19   143,602          244           36,309     129,010       22,313     0.035               38.36
22   221,584          316           55,935     199,111       34,250     0.058               90.67
25   323,650          412           81,600     290,875       49,832     0.079               162.45
28   453,040          528           114,114    407,218       69,545     0.114               274.2
29   502,802          560           126,614    451,965       77,118     0.119               275.17

Page 98: Compiling Graphical Models

Students (Pasula & Russell, 2001)

P professors, S students

Various relations, such as famous(p), well-funded(p), success(s), advises(p,s)

Sample Query: probability a professor is well-funded given success of advised students

Page 99: Compiling Graphical Models

Students

Students-Profs   Network params   Treewidth w   CNF Vars   CNF Clauses   AC Edges     Online Time (sec)   Offline Time (min)
04-08            11,566           72            3,099      11,099        445,410      0.0530              2
04-16            21,070           101           5,859      21,115        815,461      0.0930              3
05-10            20,688           128           5,624      20,279        2,531,230    0.2885              3
05-20            38,168           148           10,734     38,889        5,236,257    1.8439              7
06-12            33,454           176           9,209      33,353        16,936,504   3.2120              14
06-24            62,302           233           17,693     64,325        36,450,231   12.9663             33

Page 100: Compiling Graphical Models

Diagnosis QMR-like: Effect of Encoding Evidence

600 diseases (D) and 4100 features (F)

Feature Fj is a noisy-or of parent diseases Di

(11 parents chosen randomly)

Sample Query: probability of disease given partial evidence on features.

[Figure: two-layer network with diseases D1, D2, D3, …, Dm as parents of features F1, F2, …, Fn]
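For readers unfamiliar with noisy-or, the following sketch shows the standard leak-free form of the model: a feature is absent only if every present parent disease independently fails to cause it. The function name `noisy_or` and the link probabilities are hypothetical illustration values, not parameters of the network in the tutorial.

```python
from math import prod

def noisy_or(link_probs, active):
    """Pr(feature = true | the diseases in `active` are present), with no leak term."""
    return 1.0 - prod(1.0 - link_probs[d] for d in active)

# Hypothetical link strengths for one feature with three parent diseases.
LINKS = {"D1": 0.8, "D2": 0.5, "D3": 0.1}

p_none = noisy_or(LINKS, [])             # no disease present
p_two = noisy_or(LINKS, ["D1", "D2"])    # 1 - (1 - 0.8) * (1 - 0.5)
```

With no parent present the feature is false with certainty, so even this soft model contains determinism, one of the two kinds of local structure the tutorial exploits.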

Page 101: Compiling Graphical Models

Diagnosis QMR-like: Effect of Encoding Evidence

Treewidth: 586-589

CNF variables: 94,900

CNF clauses: 188,600

No. True Features   AC Edges    Online Time (sec)   Offline Time (sec)
0                   48,100      0.05                23.73
3                   52,830      0.05                23.86
6                   57,638      0.05                23.81
9                   62,547      0.05                23.82
12                  67,632      0.05                24.19
15                  73,321      0.04                23.6
18                  81,629      0.05                24.95
21                  109,335     0.05                30.95
25                  434,445     0.08                155.12
27                  1,141,674   0.17                469.7
28                  1,691,833   0.23                728.52
29                  2,352,820   0.3                 1,046.93

Page 102: Compiling Graphical Models

Genetic Linkage Analysis

Ordering genes on a chromosome and determining distance between them

Useful for predicting and detecting diseases

Associating functionality of genes with their location on the chromosome

[Figure: chromosome with Gene 1, Gene 2, Gene 3]

Page 103: Compiling Graphical Models

Pedigrees + Phenotypes + Genotypes

Page 104: Compiling Graphical Models
Page 105: Compiling Graphical Models

Arithmetic Circuit

Gene 1

Gene 2

Page 106: Compiling Graphical Models

State of the Art Linkage

Pedigree   Offline (sec)   AC Edges     Online (sec)   Superlink 1.4 (sec)
EE33       25.33           2,070,707    0.59           1,046.72
EE37       61.29           1,855,410    0.39           1,381.61
EE30       376.78          27,997,686   8.37           815.33
EE23       89.47           3,986,816    1.08           502.02
EE18       283.96          23,632,200   6.63           248.11

Page 107: Compiling Graphical Models

Model Compilation: Factoring MLFs into ACs

Classical algorithms factor MLFs into ACs:
  Jointree embeds AC
  Variable elimination constructs AC bottom up
  Recursive conditioning constructs AC top down

Factoring MLFs into ACs can be reduced to logical reasoning

Exploiting local structure to build smaller ACs:
  Compiling models with very high treewidth is commonplace
  Boundary between exact and approximate inference is much changed

Public systems now available!

Page 108: Compiling Graphical Models
Page 109: Compiling Graphical Models