Compiling Graphical Models
Adnan Darwiche
University of California, Los Angeles
UAI’06 Tutorial
Compilation: Historical Motivation
Separate inference into two phases:
• Offline: Compile the model into a structure
• Online: Use the structure to answer queries
Goal: Push as much work as possible into the offline phase to optimize online inference time
Best initial example:
• Offline: Compile a Bayesian network into a jointree
• Online: Use the jointree to answer multiple queries efficiently
Compilation: Modern Motivation
Exploit model structure in inference:
• Global structure: exhibited in model topology, measured by treewidth, exploited by most (non-compilation) algorithms
• Local structure: exhibited in model parameters
  Type 1: Determinism
  Type 2: Context-specific independence
Main theme: local structure is best exploited in the context of compilation
Compilation: Theoretical Implications
Unifies inference paradigms:
• Variable elimination
• Jointree (tree clustering)
• Conditioning
Compilation as a trace of classical inference
Bayesian Networks
[Figure: Bayesian network with nodes Battery Age, Alternator, Fan Belt, Battery Charge Delivered, Battery Power, Starter, Radio, Lights, Engine Turn Over, Gas Gauge, Gas, Fuel Pump, Fuel Line, Distributor, Spark Plugs, Engine Start]
Local Knowledge
Bayesian Networks
[Figure: the same Bayesian network, with the conditional probability table of Lights given Battery Power]
Battery Power   Lights = ON   Lights = OFF
OK              .99           .01
WEAK            .20           .80
DEAD            0             1
If Battery Power = OK, then Lights = ON (99%)
….
Global Structure: Treewidth w
O(n exp(w))
[Figure: the same Bayesian network]
Local Structure: CSI and Determinism
[Figure: the same Bayesian network]
Context-Specific Independence (CSI)
Local Structure: CSI and Determinism
[Figure: the same Bayesian network, with the conditional probability table of Lights given Battery Power]
Battery Power   Lights = ON   Lights = OFF
OK              .99           .01
WEAK            .20           .80
DEAD            0             1
If Battery Power = DEAD, then Lights = OFF
Determinism
Today’s Models …
Characterized by:
• Richness in local structure (determinism, CSI)
• Massiveness in size (100,000s of variables not uncommon)
• High connectivity (treewidth > 50, > 100)
Enabled by:
• High-level modeling tools: relational, first order
• New application areas (synthesis): bioinformatics (e.g. linkage analysis), sensor networks
Exploiting local structure a must!
High Order Specifications: Relational Models…
burglary(v)=0.005;
alarm(v)=(burglary(v):0.95,0.01);
calls(v,w)= (neighbor(v,w): (prankster(v)): (alarm(w):0.9,0.05), (alarm(w):0.9,0)),0);
alarmed(v)= n-or{calls(w,v)|w:neighbor(w,v)}
Primula
Friends and Smokers (Richardson & Domingos, 2004)
• M individuals
• Relations such as smokes(p), cancer(p), friend(p1,p2)
• Logical constraints such as: if one of p's friends smokes, then p smokes.
• Sample Query: probability that a given person has cancer
M    Network params   Treewidth* w   CNF Vars   CNF Clauses
1    34               3              12         31
4    1,552            13             414        1,390
7    7,714            36             1,995      6,916
10   21,760           70             5,565      19,525
13   46,930           118            11,934     42,133
16   86,464           172            21,912     77,656
19   143,602          244            36,309     129,010
22   221,584          316            55,935     199,111
25   323,650          412            81,600     290,875
28   453,040          528            114,114    407,218
29   502,802          560            126,614    451,965
Students (Pasula & Russell, 2001)
• P professors, S students
• Various relations, such as famous(p), well-funded(p), success(s), advises(p,s)
• Sample Query: probability a professor is well-funded given success of advised students
Students-Profs   Network params   Treewidth w   CNF Vars
04-08            11,566           72            3,099
04-16            21,070           101           5,859
05-10            20,688           128           5,624
05-20            38,168           148           10,734
06-12            33,454           176           9,209
06-24            62,302           233           17,693
Genetic Linkage Analysis
• Ordering genes on a chromosome and determining distance between them
• Useful for predicting and detecting diseases
• Associating functionality of genes with their location on the chromosome
[Figure: Gene 1, Gene 2, Gene 3]
Pedigrees + Phenotype + Genotype
DBNs from Speech Applications
Coding Networks
Tutorial Outline
• Theoretical foundations
• Online query answering algorithms
• Offline compilation algorithms
• Applications
• Concluding remarks
Theoretical Foundations
Graphical Model (Bayesian, Markov Networks): Is a Multi-Linear Function (MLF)
Compiled Model: Is an Arithmetic Circuit (AC)
Compilation process: Factoring MLF into AC
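Multi-Linear Functions → Arithmetic Circuits
[Figure: network A → B]
f = λa λb θa θb|a + λa λ~b θa θ~b|a + λ~a λb θ~a θb|~a + λ~a λ~b θ~a θ~b|~a
[Figure: factoring turns f into an arithmetic circuit of + and * nodes whose leaves are the indicators λ and the parameters θ]
A Differential Approach to Inference in Bayesian Networks, JACM-03 (Darwiche)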
Factoring Multi-linear Functions (MLFs)
MLF: a + ad + abd + abcd
[Figure: Arithmetic Circuit (AC) for the MLF, built from + and * nodes over the leaves a, b, c, d, 1]
An MLF has an exponential number of terms, yet it may be represented by an AC with polynomial size!
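As a small illustration of factoring, the four-term MLF a + ad + abd + abcd can be written in the nested form a(1 + d(1 + b(1 + c))). The sketch below checks that the two forms agree. The function names are illustrative, and the nested form is one possible factorization, not necessarily the circuit drawn on the slide.

```python
from itertools import product

def mlf_expanded(a, b, c, d):
    # The MLF written term by term: a + ad + abd + abcd
    return a + a * d + a * b * d + a * b * c * d

def mlf_factored(a, b, c, d):
    # One possible factorization (an arithmetic circuit in nested form)
    return a * (1 + d * (1 + b * (1 + c)))

# A multi-linear function is determined by its values on 0/1 inputs,
# so agreement on all 16 assignments means the two forms are equal.
agree = all(
    mlf_expanded(*v) == mlf_factored(*v) for v in product([0, 1], repeat=4)
)
print(agree)
```

The expanded form uses six multiplications and three additions; the nested form uses three of each, and the gap grows with the number of terms.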
• A graphical model defines an MLF
• Evaluating the MLF for a given evidence gives the probability of evidence
• The inference problem can be formulated as factoring the MLF of a graphical model
Circuit Complexity: Size of smallest AC that computes the MLF
A       B       Pr(.)
true    true    .03
true    false   .27
false   true    .56
false   false   .14
Pr(a) = .03 + .27 = .3
Graphical Models as MLFs
A       B       Pr(.)
true    true    .03
true    false   .27
false   true    .56
false   false   .14
Pr(~b) = .27 + .14 = .41
Graphical Models as MLFs
A       B       Pr(.)   Term
true    true    .03     λa*λb * .03
true    false   .27     λa*λ~b * .27
false   true    .56     λ~a*λb * .56
false   false   .14     λ~a*λ~b * .14
F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b
Graphical Models as MLFs
F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b
Pr(a,~b) = F(λ~a:0, λ~b:1, λa:1, λb:0) = .27
Pr(a) = F(λ~a:0, λ~b:1, λa:1, λb:1) = .03 + .27 = .3
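The two evaluations above can be reproduced with a few lines of code. This is a minimal sketch that stores the joint table from the slides and evaluates the MLF by brute force; the function and argument names are illustrative.

```python
# Joint distribution Pr(A, B) from the slides.
joint = {
    (True, True): 0.03,
    (True, False): 0.27,
    (False, True): 0.56,
    (False, False): 0.14,
}

def F(lam_a, lam_not_a, lam_b, lam_not_b):
    """Evaluate the MLF: sum over rows of (row probability * row indicators)."""
    lam_A = {True: lam_a, False: lam_not_a}
    lam_B = {True: lam_b, False: lam_not_b}
    return sum(p * lam_A[a] * lam_B[b] for (a, b), p in joint.items())

# Evidence sets an indicator to 0 when its value contradicts the evidence.
pr_a_not_b = F(lam_a=1, lam_not_a=0, lam_b=0, lam_not_b=1)  # evidence a, ~b
pr_a = F(lam_a=1, lam_not_a=0, lam_b=1, lam_not_b=1)        # evidence a
print(pr_a_not_b, pr_a)
```

Setting every indicator to 1 sums the whole table, so F(1, 1, 1, 1) should come out as 1 up to rounding.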
[Figure: network with edges A → B and A → C and parameters θa, θb|a, θc|a]
A    B    C    Pr(.)
a    b    c    θa θb|a θc|a
a    b    ~c   θa θb|a θ~c|a
a    ~b   c    θa θ~b|a θc|a
a    ~b   ~c   θa θ~b|a θ~c|a
.    .    .    …
[Figure: network with edges A → B and A → C and parameters θa, θb|a, θc|a]
A    B    C    Pr(.)
a    b    c    λa λb λc θa θb|a θc|a
a    b    ~c   λa λb λ~c θa θb|a θ~c|a
a    ~b   c    λa λ~b λc θa θ~b|a θc|a
a    ~b   ~c   λa λ~b λ~c θa θ~b|a θ~c|a
.    .    .    …
F = λa λb λc θa θb|a θc|a +
    λa λb λ~c θa θb|a θ~c|a +
    λa λ~b λc θa θ~b|a θc|a +
    λa λ~b λ~c θa θ~b|a θ~c|a +
    ….
[Figure: network over A, B, C, D with parameters θa, θb|a, θc|a, θd|bc]
F = λa λb λc λd θa θb|a θc|a θd|bc +
    λa λb λc λ~d θa θb|a θc|a θ~d|bc +
    ….
Each term has 2n variables (n indicators, n parameters)
Each variable has degree one (multi-linear function)
Multi-Linear Functions → Arithmetic Circuits
[Figure: the MLF f of network A → B factored into an arithmetic circuit, as shown earlier]
Online Query Answering
Complexity: Time and space linear in the AC size
Queries:
• Probability of evidence, with evidence flipping/fast retraction
• Variable and family marginals
• MPE: most probable explanation
• Sensitivity analysis (derivatives)
Evaluating the Polynomial
F(e) = Pr(e)
PR: Probability of Evidence
[Figure: the Bayesian network, annotated with Pr(e)]
The Partial Derivatives
∂F/∂λx (e) = Pr(e-X, x)
PR: Probability of Evidence Flips
[Figure: the Bayesian network with a variable X marked; flipping the evidence on X changes Pr(e) into Pr(e-X, x)]
The Partial Derivatives
θx|u ∂F/∂θx|u (e) = Pr(e, x, u)
PR: Family Marginals
[Figure: the Bayesian network with a variable X and its parents U marked, annotated with Pr(e, x, u)]
Multi-Linear Functions → Arithmetic Circuits
[Figure: the MLF f of network A → B factored into an arithmetic circuit, as shown earlier]
Circuit Evaluation and Differentiation: Marginals
[Figure: the arithmetic circuit for network A → B evaluated under evidence a, with θa = .3, θ~a = .7, θb|a = .1, θ~b|a = .9, θb|~a = .8, θ~b|~a = .2 and indicators λa = 1, λ~a = 0, λb = 1, λ~b = 1; each node carries its value (upward pass) and its derivative (downward pass)]
f(a) = .3 = Pr(a)
∂f/∂λb (a) = .03 = Pr(a, b)
∂f/∂λ~a (a) = .7 = Pr(~a)
Two passes only:
• Probability of evidence (with evidence flipping)
• Node marginals
• Family marginals
• Sensitivity
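The two passes can be sketched in code. The circuit below is the factored form for network A → B that the variable-elimination slides derive; the parameter values come from the slides' tables, while the node names and the dictionary layout are illustrative assumptions. The upward pass computes node values and the downward pass computes the derivative of the root with respect to every node.

```python
from math import prod

# Leaves: parameters (from the slides' tables) and indicators for evidence a.
leaves = {
    "theta_a": 0.3, "theta_na": 0.7,
    "theta_b|a": 0.1, "theta_nb|a": 0.9,
    "theta_b|na": 0.8, "theta_nb|na": 0.2,
    "lam_a": 1.0, "lam_na": 0.0, "lam_b": 1.0, "lam_nb": 1.0,
}

# Internal nodes, listed children-before-parents: name -> (op, children).
nodes = {
    "m1": ("*", ["theta_b|a", "lam_b"]),
    "m2": ("*", ["theta_nb|a", "lam_nb"]),
    "m3": ("*", ["theta_b|na", "lam_b"]),
    "m4": ("*", ["theta_nb|na", "lam_nb"]),
    "s1": ("+", ["m1", "m2"]),
    "s2": ("+", ["m3", "m4"]),
    "p1": ("*", ["theta_a", "lam_a", "s1"]),
    "p2": ("*", ["theta_na", "lam_na", "s2"]),
    "root": ("+", ["p1", "p2"]),
}

def evaluate(leaves, nodes):
    """Upward pass: compute the value of every node."""
    val = dict(leaves)
    for name, (op, kids) in nodes.items():
        vals = [val[k] for k in kids]
        val[name] = sum(vals) if op == "+" else prod(vals)
    return val

def differentiate(val, nodes, root="root"):
    """Downward pass: derivative of the root with respect to every node."""
    dr = {name: 0.0 for name in val}
    dr[root] = 1.0
    for name in reversed(list(nodes)):  # parents before children
        op, kids = nodes[name]
        for i, k in enumerate(kids):
            if op == "+":
                dr[k] += dr[name]
            else:
                # product of the sibling values (avoids dividing by zero)
                others = prod(val[j] for n, j in enumerate(kids) if n != i)
                dr[k] += dr[name] * others
    return dr

val = evaluate(leaves, nodes)
dr = differentiate(val, nodes)
print(val["root"], dr["lam_b"], dr["lam_na"])
```

With evidence a, the root value is Pr(a), the derivative with respect to λb is Pr(a, b), and the derivative with respect to λ~a is Pr(~a), matching the numbers on the slide.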
Efficient Eval/Diff Schemes
Assume alternating levels of +/* nodes, with one parent per * node
• Method A: Two registers per + node (no registers for * nodes)
• Method B: One register per node (used for values in the upward pass, then overwritten with derivatives in the downward pass)
• Method C: One register per node, one bit per * node
Circuit Optimization: MPE
[Figure: the same circuit with each + node replaced by a max node (m), evaluated under evidence a; the root evaluates to MPE(a) = .27]
Circuit Optimization: MPE
[Figure: tracing the maximizing branches from the root selects the leaves θa, θ~b|a, λa, λ~b]
MPE: a, ~b
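A minimal sketch of the maximizer circuit for network A → B follows, assuming the parameter values from the slides' tables. Each + node of the factored circuit becomes a max, and the argmax choices are recorded to recover the MPE instantiation. The function name and data layout are illustrative.

```python
# Parameters of network A -> B (values from the slides' tables).
theta_a = {True: 0.3, False: 0.7}
theta_b_given_a = {            # keyed by (b, a)
    (True, True): 0.1, (False, True): 0.9,
    (True, False): 0.8, (False, False): 0.2,
}

def mpe(lam_a, lam_b):
    """Evaluate the circuit with max in place of + and trace back the argmax.

    lam_a and lam_b map each value (True/False) to its 0/1 indicator.
    """
    best_value, best_inst = -1.0, None
    for a in (True, False):                      # root max node
        # inner max node over the values of B
        inner, b_star = max(
            (theta_b_given_a[(b, a)] * lam_b[b], b) for b in (True, False)
        )
        value = theta_a[a] * lam_a[a] * inner
        if value > best_value:
            best_value, best_inst = value, (a, b_star)
    return best_value, best_inst

# Evidence a: the indicator of ~a is 0, B is unobserved.
value, inst = mpe(lam_a={True: 1, False: 0}, lam_b={True: 1, False: 1})
print(value, inst)
```

Under evidence a the maximizer value is .3 * .9 = .27 and the traced instantiation is (a, ~b), as on the slide.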
Custom Hardware for Evaluating ACs
Adharapurapu, Ercegovac (2004)
Offline Compilation
Factoring MLFs into ACs:
• Jointree: Embeds AC
• Variable Elimination: Trace is an AC
• Recursive Conditioning: Trace is an AC
Reduction to Logic: CNF to d-DNNF compilation
Compiling using Jointrees
Classical Jointree Algorithm:
• Convert model into jointree
• Jointree propagation (two passes)
Modern interpretation:
• Jointree embeds an AC that factors the MLF
• Jointree propagation is evaluating/differentiating the embedded AC
A Jointree Embeds an AC…
[Figure: jointree with clusters AB, AC, AD, AE joined through separators A, one cluster marked as root, and the arithmetic circuit it embeds]
Inward pass evaluates the circuit
Outward pass differentiates the circuit [Hugin, Shenoy-Shafer, …]
A Differential Semantics to Jointree Algorithms, AIJ-04 (with James Park)
Jointree Flavors
• Shenoy-Shafer: Method A
• Hugin: Method B (loses information)
• Zero-Conscious Hugin (new): Method C (best of A, B)
Compiling using Variable Elimination (VE)
VE operates on factors: mappings from variable instantiations to real numbers
VE performs two operations on factors:
• Multiply two factors
• Sum out a variable from a factor
Factors have different representations:
• Tables
• More structured representations (decision trees/graphs)
Overhead problem for structured factors
Tabular Factors
[Figure: network A → B]
TA:
A       Value
true    .3
false   .7
TB:
A       B       Value
true    true    .1
true    false   .9
false   true    .8
false   false   .2
Structured Factors: Algebraic Decision Diagrams (ADDs)
[Figure: an ADD over variables X, Y, Z with leaf values .1, .9, .5]
Network  Max Clust  Vars  Card  Total Parms  %Det  %Distinct
alarm 7.2 37 2...4 752 0.9 24.6
bm 20 1005 2...2 6972 99.6 100
diabetes 17.2 413 3...21 461069 78.2 17.6
hailfinder 11.7 56 2...11 3741 15.7 26.9
mildew 21.4 35 3...100 547158 93.2 25.1
mm 23 1220 2...2 8326 98.7 75
munin1 26.8 189 1...21 19466 66.5 61.2
munin2 18.6 1003 2...21 83920 63.3 69.5
munin3 17.8 1044 1...21 85855 63.1 71.3
munin4 21.4 1041 1...21 98183 64.5 65.3
pathfinder 15 109 2...63 97851 56.1 5.1
pigs 17.4 441 3...3 8427 56.2 23.9
students 22 376 2...2 2616 90.7 79.3
tcc4f 10 105 2...2 3236 0.4 35.6
water 19.9 32 3...4 13484 54 57
Networks with Local Structure
VE: Tabular vs ADD Representations of Factors
Network  Tabular Time (ms)  ADD Time (ms)  Improvement
alarm 31 360 0.086
barley 307 14,049 0.022
bm-5-3 4,892 658 7.435
diabetes 949 33,220 0.029
hailfinder 48 515 0.093
link 1,688 2,658 0.635
mm-3-8-3 2,166 843 2.569
mildew 72 92,602 0.001
munin1 155 1,255 0.124
munin2 204 3,170 0.064
munin3 350 5,049 0.069
munin4 406 4,361 0.093
pathfinder 51 5,213 0.01
pigs 69 597 0.116
st-3-2 186 362 0.514
tcc4f 29 153 0.19
water 76 1,015 0.075
Compiling using Variable Elimination (VE)
By using symbolic factors and corresponding operations, VE compiles out an AC
• VE with tabular factors: generates ACs similar to those embedded in a jointree
• VE with structured factors: generates much smaller ACs; overhead pushed into the offline phase
Factors
[Figure: network A → B]
TA:
A       Value
true    .3
false   .7
TB:
A       B       Value
true    true    .1
true    false   .9
false   true    .8
false   false   .2
Symbolic Factors
[Figure: network A → B]
TA:
A       Value
true    θa * λa
false   θ~a * λ~a
TB:
A       B       Value
true    true    θb|a * λb
true    false   θ~b|a * λ~b
false   true    θb|~a * λb
false   false   θ~b|~a * λ~b
Summing out Variable B
TB:
A       B       Value
true    true    θb|a * λb
true    false   θ~b|a * λ~b
false   true    θb|~a * λb
false   false   θ~b|~a * λ~b
T’B (B summed out):
A       Value
true    θb|a * λb + θ~b|a * λ~b
false   θb|~a * λb + θ~b|~a * λ~b
Multiplying Factors
TA:
A       Value
true    θa * λa
false   θ~a * λ~a
T’B:
A       Value
true    θb|a * λb + θ~b|a * λ~b
false   θb|~a * λb + θ~b|~a * λ~b
TA * T’B:
A       Value
true    θa * λa * (θb|a * λb + θ~b|a * λ~b)
false   θ~a * λ~a * (θb|~a * λb + θ~b|~a * λ~b)
Summing out Variable A
θa * λa * (θb|a * λb + θ~b|a * λ~b) + θ~a * λ~a * (θb|~a * λb + θ~b|~a * λ~b)
VE factors MLF into AC (Bottom-up Construction)
[Figure: the MLF f of network A → B factored into an arithmetic circuit, as shown earlier]
•Time and space complexity of generating AC is similar to Variable Elimination: Exponential only in treewidth
•Generated ACs similar to those embedded in Jointree
•Recall: AC can be used to answer multiple queries!
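The symbolic run of variable elimination on network A → B can be sketched as follows. Factor entries are expression trees rather than numbers, so the final entry is the compiled circuit, which can then be evaluated for any evidence. The helper names (`mul`, `add`, `sum_out_last`, `multiply_on_A`) and the tuple encoding of expressions are illustrative assumptions, and the multiply helper only handles this two-variable example.

```python
from math import prod

# An expression is a leaf name (str) or a tuple (op, [children]).
def mul(*xs):
    return ("*", list(xs))

def add(*xs):
    return ("+", list(xs))

# Symbolic factors for network A -> B: each entry is parameter * indicator.
T_A = {(a,): mul(f"theta_{a}", f"lam_{a}") for a in ("a", "~a")}
T_B = {(a, b): mul(f"theta_{b}|{a}", f"lam_{b}")
       for a in ("a", "~a") for b in ("b", "~b")}

def sum_out_last(factor):
    """Sum out the last variable of a factor keyed by value tuples."""
    groups = {}
    for key, expr in factor.items():
        groups.setdefault(key[:-1], []).append(expr)
    return {key: add(*exprs) for key, exprs in groups.items()}

def multiply_on_A(f, g):
    """Multiply two factors that are both over the single variable A."""
    return {key: mul(f[key], g[key]) for key in f}

T_B_prime = sum_out_last(T_B)                     # sum out B
product_factor = multiply_on_A(T_A, T_B_prime)    # TA * T'B
circuit = sum_out_last(product_factor)[()]        # sum out A: the compiled AC

def evaluate(expr, env):
    """Evaluate an expression tree given values for its leaves."""
    if isinstance(expr, str):
        return env[expr]
    op, kids = expr
    vals = [evaluate(k, env) for k in kids]
    return sum(vals) if op == "+" else prod(vals)

params = {"theta_a": 0.3, "theta_~a": 0.7,
          "theta_b|a": 0.1, "theta_~b|a": 0.9,
          "theta_b|~a": 0.8, "theta_~b|~a": 0.2}
evidence_a = {"lam_a": 1, "lam_~a": 0, "lam_b": 1, "lam_~b": 1}
pr_a = evaluate(circuit, {**params, **evidence_a})
print(pr_a)
```

The circuit is built once offline; changing the evidence only changes the indicator values passed to `evaluate`.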
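Structured Factors: Algebraic Decision Diagrams (ADDs)
[Figure: an ADD over variables X, Y, Z with leaf values .1, .9, .5]
Structured Factors: Algebraic Decision Diagrams (ADDs)
[Figure: the same ADD over X, Y, Z with leaves labeled 1, 2, 3]
Symbolic ADD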
•Modify standard ADD operations (multiply, sum-out) to operate on symbolic ADDs
•Run variable elimination with symbolic ADDs
•Compile out an AC
•Asymptotic complexity is no worse than variable elimination
•Overhead of ADDs is pushed into offline phase
•Generated AC can be much smaller
•Online inference can be much faster
Network  Ace Time (s)  ADD-VE Time (s)  Improv.  Tabular-VE AC size  ADD-VE AC size  Improv.
alarm 0.3 3.9 0.1 3,534 3,030 1.2
barley 8,190.20 122.8 66.7 66,467,777 24,653,744 2.7
bm-5-3 0.8 6 0.1 75,591,750 14,836 5095.2
diabetes 1,710.00 110.3 15.5 34,728,957 17,219,042 2
hailfinder 0.7 1.2 0.5 72,755 25,992 2.8
link - 699.7 - 127,262,777 89,097,450 1.4
mildew 3,125.20 218.9 14.3 16,094,592 3,352,330 4.8
mm-3-8-3 1.5 11.9 0.1 36,635,566 108,428 337.9
munin1 1,005.10 316.7 3.2 1,260,407,123 31,409,970 40.1
munin2 198.4 31.7 6.3 20,295,426 5,662,218 3.6
munin3 188.4 17.6 10.7 16,987,088 3,503,242 4.8
munin4 205 37.8 5.4 76,028,532 6,869,760 11.1
pathfinder 4.9 5.8 0.9 796,588 44,468 17.9
pigs 23.1 10 2.3 4,925,388 2,558,680 1.9
st-3-2 0.5 2.4 0.2 19,374,934 22,070 877.9
tcc4f 0.9 1.1 0.8 33,408 22,612 1.5
water 3 20.7 0.1 15,996,054 170,428 93.9
Tabular vs ADD: VE Compilations
Network Jointree ADD-VE Improv.
alarm 166 32 5.2
barley 65,226 35,209 1.9
bm-5-3 89,593 83 1079.4
diabetes 29,316 20,421 1.4
hailfinder 245 70 3.5
link 223,542 175,769 1.3
mildew 10,077 4,522 2.2
mm-3-8-3 34,001 198 171.7
munin1 669,915 37,451 17.9
munin2 17,857 7,180 2.5
munin3 13,351 4,945 2.7
munin4 42,754 8,683 4.9
pathfinder 1,332 102 13.1
pigs 3,020 2,814 1.1
st-3-2 17,536 82 213.9
tcc4f 281 73 3.8
water 16,676 251 66.4
ADD-VE vs Jointree: Online Inference Time (ms)
Computing all marginals, for 16 pieces of random evidence
Work on structured representations of factors is now much more relevant and practical.
Compiling by Reduction to Logic
• Algebraic: MLFs / ACs
• Logical: CNF / d-DNNF
Factoring MLF into AC can be reduced to factoring CNF into d-DNNF
CNF to d-DNNF compilers are very powerful (natural for exploiting determinism and CSI)
Compiler: http://reasoning.cs.ucla.edu/c2d
Reduction to Logic
[Figure: Multi-Linear Function → (Encode) → CNF → (Compile) → d-DNNF → (Decode) → Arithmetic Circuit]
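Multi-linear function: a c + a b c + c
Propositional theory: c ∧ (a ∨ ¬b)
[Figure: Encode maps the MLF to the propositional theory; Compile turns the theory into a smooth d-DNNF over the literals a, ¬a, b, ¬b, c; Decode turns the d-DNNF into an arithmetic circuit over a, b, c, 1]
MLFs, ACs, CNFs, d-DNNF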
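The encoding step can be checked by brute force: each model of the propositional theory should correspond to one term of the MLF, namely the product of the variables the model sets to true. The sketch below assumes the theory on the slide is c ∧ (a ∨ ¬b), with the connectives reconstructed from the three terms of the MLF.

```python
from itertools import product

def theory(a, b, c):
    # Assumed propositional encoding of the MLF: c AND (a OR NOT b)
    return c and (a or not b)

# Each model contributes one term: the variables it sets to true.
terms = sorted(
    "".join(name for name, val in zip("abc", model) if val)
    for model in product([False, True], repeat=3)
    if theory(*model)
)
print(terms)
```

The three models give exactly the terms abc, ac and c of the MLF a c + a b c + c.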
Deterministic, Decomposable NNF
[Figure: an NNF circuit of and/or nodes over the literals of variables A, B, C, D, E]
Deterministic: Disjuncts are logically disjoint
Decomposable: Conjuncts share no variables
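The two properties can be made concrete on a tiny NNF. The sketch below uses an illustrative tuple encoding of NNF nodes (not the format of any particular compiler), checks decomposability syntactically and determinism by brute force, and shows why the properties matter: together they let models be counted in one bottom-up pass.

```python
from itertools import product

# A node is ("lit", var, sign), ("and", [children]) or ("or", [children]).
def lit(var, sign=True):
    return ("lit", var, sign)

def variables(n):
    if n[0] == "lit":
        return {n[1]}
    return set().union(*(variables(c) for c in n[1]))

def holds(n, world):
    if n[0] == "lit":
        return world[n[1]] == n[2]
    results = [holds(c, world) for c in n[1]]
    return all(results) if n[0] == "and" else any(results)

def is_decomposable(n):
    """Conjuncts of every and-node share no variables."""
    if n[0] == "lit":
        return True
    if n[0] == "and":
        seen = set()
        for c in n[1]:
            vs = variables(c)
            if seen & vs:
                return False
            seen |= vs
    return all(is_decomposable(c) for c in n[1])

def is_deterministic(n):
    """Brute-force check: no world satisfies two disjuncts of any or-node."""
    if n[0] == "lit":
        return True
    if n[0] == "or":
        vs = sorted(variables(n))
        for values in product([False, True], repeat=len(vs)):
            world = dict(zip(vs, values))
            if sum(holds(c, world) for c in n[1]) > 1:
                return False
    return all(is_deterministic(c) for c in n[1])

def count_models(n, all_vars):
    """Model count over all_vars; valid for deterministic, decomposable NNF."""
    def go(n):  # count over the variables mentioned in n
        if n[0] == "lit":
            return 1
        if n[0] == "and":
            out = 1
            for c in n[1]:
                out *= go(c)
            return out
        vs = variables(n)
        # or-node: pad each disjunct for the variables it does not mention
        return sum(go(c) * 2 ** len(vs - variables(c)) for c in n[1])
    return go(n) * 2 ** len(set(all_vars) - variables(n))

# (A and B) or (not A and C)
example = ("or", [("and", [lit("A"), lit("B")]),
                  ("and", [lit("A", False), lit("C")])])
n_models = count_models(example, ["A", "B", "C"])
print(is_decomposable(example), is_deterministic(example), n_models)
```

The padding in the or-node case plays the role of smoothing; on a smooth d-DNNF every disjunct mentions the same variables and the padding factor is always 1.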
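Compiling CNFs into d-DNNFs (AAAI-02, ECAI-04)
Compiler at http://reasoning.cs.ucla.edu/c2d
Recursive Conditioning for Compilation
[Figure: a CNF with clauses over {A, B, C} and {A, D, E} is conditioned on A; under each value of A it decomposes into independent parts over {B, C} and {D, E}, giving an or node over two and nodes]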
Why Logic?
Encoding local structure is easy:
• Determinism encoded by adding clauses
• CSI encoded by collapsing variables
A natural environment to exploit local structure:
• DD-backtracking, clause learning, …
• Non-structural decomposition
• Non-structural (formula) caching
[Figure labels: θC|A = 0 (determinism); θC|AB = θC|A¬B (CSI)]
Local Structure
[Figure: node S with parents A, B, C]
Tabular CPT:
A    B    C    Pr(S|A,B,C)
a    b    c    0.95
a    b    ~c   0.95
a    ~b   c    0.20
a    ~b   ~c   0.05
~a   b    c    0.00
~a   b    ~c   0.00
~a   ~b   c    0.00
~a   ~b   ~c   0.00
Local structure in this CPT:
• Functional constraints
• Context-specific independence
Determinism
[Figure: the same tabular CPT for Pr(S|A,B,C), with its 0.00 entries]
λ~a ∧ λb ∧ λc ∧ λs ↔ θs|~abc is replaced by the clause ¬λ~a ∨ ¬λb ∨ ¬λc ∨ ¬λs
Context-Specific Independence
[Figure: the same tabular CPT for Pr(S|A,B,C), with its two equal 0.95 entries]
λa ∧ λb ∧ λc ∧ λs ↔ θs|abc and λa ∧ λb ∧ λ~c ∧ λs ↔ θs|ab~c are collapsed into λa ∧ λb ∧ λs ↔ θs|ab
[Figure: a belief network over X and Y is encoded as a CNF, compiled into a smooth d-DNNF, and decoded into an arithmetic circuit over indicators λ and parameters θ]
The Ace System: http://reasoning.cs.ucla.edu/ace
Network  Ace Time (s)  ADD-VE Time (s)  Improv.  Tabular-VE AC size  ADD-VE AC size  Improv.
alarm 0.3 3.9 0.1 3,534 3,030 1.2
barley 8,190.20 122.8 66.7 66,467,777 24,653,744 2.7
bm-5-3 0.8 6 0.1 75,591,750 14,836 5095.2
diabetes 1,710.00 110.3 15.5 34,728,957 17,219,042 2
hailfinder 0.7 1.2 0.5 72,755 25,992 2.8
link - 699.7 - 127,262,777 89,097,450 1.4
mildew 3,125.20 218.9 14.3 16,094,592 3,352,330 4.8
mm-3-8-3 1.5 11.9 0.1 36,635,566 108,428 337.9
munin1 1,005.10 316.7 3.2 1,260,407,123 31,409,970 40.1
munin2 198.4 31.7 6.3 20,295,426 5,662,218 3.6
munin3 188.4 17.6 10.7 16,987,088 3,503,242 4.8
munin4 205 37.8 5.4 76,028,532 6,869,760 11.1
pathfinder 4.9 5.8 0.9 796,588 44,468 17.9
pigs 23.1 10 2.3 4,925,388 2,558,680 1.9
st-3-2 0.5 2.4 0.2 19,374,934 22,070 877.9
tcc4f 0.9 1.1 0.8 33,408 22,612 1.5
water 3 20.7 0.1 15,996,054 170,428 93.9
ADD-VE vs Logic (Ace): Compile Times
Network Nodes Parameters Max Cluster
mastermind_04_08_03 1418 9802 26
mastermind_06_08_03 1814 12754 37
mastermind_10_08_03 2606 18658 54
mastermind_03_08_04 2288 16008 31
mastermind_04_08_04 2616 18488 39
mastermind_03_08_05 3692 26186 40
students_03_02 376 2616 25
students_03_12 1346 9856 59
students_04_16 2827 21070 101
students_05_20 5064 38168 148
students_06_24 8201 62302 233
blockmap_05_03 1005 6972 23
blockmap_10_03 6848 48758 52
blockmap_15_03 18787 132436 68
blockmap_20_03 43356 307220 92
blockmap_22_03 59404 423452 104
ADD-VE vs Logic (Ace)
Network  Offline Time (min)  AC Nodes  AC Edges  Online Inference Time (s)
mastermind_04_08_03 1 71,666 541,356 0.05
mastermind_06_08_03 1 258,228 1,523,888 0.15
mastermind_10_08_03 3 1,293,323 4,315,566 0.68
mastermind_03_08_04 2 186,351 4,859,201 0.3
mastermind_04_08_04 5 932,355 19,457,308 1.73
mastermind_03_08_05 10 1,359,391 55,417,639 4.33
students_03_02 1 7,927 37,281 0.01
students_03_12 1 24,219 113,876 0.02
students_04_16 3 181,166 815,461 0.09
students_05_20 7 1,319,834 5,236,257 1.84
students_06_24 33 9,922,233 36,450,231 12.97
blockmap_05_03 1 2,833 20,636 0.01
blockmap_10_03 2 17,749 974,817 0.06
blockmap_15_03 6 47,475 7,643,307 0.38
blockmap_20_03 30 105,602 40,172,434 2.45
blockmap_22_03 61 144,136 76,649,302 4.67
ADD-VE vs Logic (Ace)
Effect of Local Structure
Local Structure Encoded   Pathfinder       Water              Munin4
None                      981,178          13,777,166         116,136,985
Det + CSI                 42,810 (4%)      134,140 (1%)       5,762,690 (5%)
Det                       130,380 (13%)    138,501 (1%)       9,997,267 (9%)
CSI                       200,787 (20%)    11,111,104 (81%)   17,612,036 (15%)
Compilation vs Direct Inference
Grid problems here…
Compilation vs Direct Inference
Grid size   Treewidth w   Det   Cachet (sec)   Ace offline (sec)   Ace online (sec)   Offline/Online
16x16       25            50%   2236           220                 2.072              1079
22x22       36            75%   2757           349                 2.178              2024
34x34       60            90%   1584           79                  0.419              3783
Average over 10 random instances for each grid
Ace available at http://reasoning.cs.ucla.edu/ace
Applications
• Relational Models
• Diagnosis
• Genetic Linkage Analysis
burglary(v)=0.005;
alarm(v)=(burglary(v):0.95,0.01);
calls(v,w)= (neighbor(v,w): (prankster(v)): (alarm(w):0.9,0.05), (alarm(w):0.9,0)),0);
alarmed(v)= n-or{calls(w,v)|w:neighbor(w,v)}
Primula/Ace: Upcoming Release
Friends and Smokers (Richardson & Domingos, 2004)
• M individuals
• Relations such as smokes(p), cancer(p), friend(p1,p2)
• Logical constraints such as: if one of p's friends smokes, then p smokes.
• Sample Query: probability that a given person has cancer
Friends & Smokers
M    Network params   Treewidth w   CNF Vars   CNF Clauses   AC Edges   Online Time (sec)   Offline Time (sec)
1    34               3             12         31            18         0                   0.03
4    1,552            13            414        1,390         293        0.003               0.44
7    7,714            36            1,995      6,916         1,295      0.006               1.92
10   21,760           70            5,565      19,525        3,512      0.005               6.66
13   46,930           118           11,934     42,133        7,430      0.013               12.8
16   86,464           172           21,912     77,656        13,535     0.022               21.68
19   143,602          244           36,309     129,010       22,313     0.035               38.36
22   221,584          316           55,935     199,111       34,250     0.058               90.67
25   323,650          412           81,600     290,875       49,832     0.079               162.45
28   453,040          528           114,114    407,218       69,545     0.114               274.2
29   502,802          560           126,614    451,965       77,118     0.119               275.17
Students (Pasula & Russell, 2001)
• P professors, S students
• Various relations, such as famous(p), well-funded(p), success(s), advises(p,s)
• Sample Query: probability a professor is well-funded given success of advised students
Students
Students-Profs   Network params   Treewidth w   CNF Vars   CNF Clauses   AC Edges     Online Time (sec)   Offline Time (min)
04-08            11,566           72            3,099      11,099        445,410      0.0530              2
04-16            21,070           101           5,859      21,115        815,461      0.0930              3
05-10            20,688           128           5,624      20,279        2,531,230    0.2885              3
05-20            38,168           148           10,734     38,889        5,236,257    1.8439              7
06-12            33,454           176           9,209      33,353        16,936,504   3.2120              14
06-24            62,302           233           17,693     64,325        36,450,231   12.9663             33
Diagnosis QMR-like: Effect of Encoding Evidence
• 600 diseases (D) and 4100 features (F)
• Feature Fj is a noisy-or of parent diseases Di (11 parents chosen randomly)
• Sample Query: probability of disease given partial evidence on features.
[Figure: two-layer network with diseases D1, D2, D3, …, Dm as parents of features F1, F2, …, Fn]
Treewidth: 586-589
CNF variables: 94,900
CNF clauses: 188,600
No. True Features   AC Edges    Online Time (sec)   Offline Time (sec)
0                   48,100      0.05                23.73
3                   52,830      0.05                23.86
6                   57,638      0.05                23.81
9                   62,547      0.05                23.82
12                  67,632      0.05                24.19
15                  73,321      0.04                23.6
18                  81,629      0.05                24.95
21                  109,335     0.05                30.95
25                  434,445     0.08                155.12
27                  1,141,674   0.17                469.7
28                  1,691,833   0.23                728.52
29                  2,352,820   0.3                 1,046.93
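For readers unfamiliar with the noisy-or features used in this QMR-like model, a minimal sketch of a noisy-or CPT entry follows. It assumes the common leak-free form, and the link probabilities are made-up illustration values, not parameters from the tutorial's networks.

```python
from math import prod

def noisy_or(link_probs, parent_states):
    """Pr(feature = true | parents) for a noisy-or without a leak term.

    link_probs[i]    : probability that parent i alone turns the feature on
    parent_states[i] : whether parent i (a disease) is present
    """
    return 1.0 - prod(1.0 - p for p, on in zip(link_probs, parent_states) if on)

# Hypothetical feature with two parent diseases.
p_none = noisy_or([0.8, 0.5], [False, False])   # no disease present
p_both = noisy_or([0.8, 0.5], [True, True])     # both diseases present
print(p_none, p_both)
```

With no parent present the feature is false with certainty, which is the kind of determinism that a logical encoding can exploit.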
Genetic Linkage Analysis
• Ordering genes on a chromosome and determining distance between them
• Useful for predicting and detecting diseases
• Associating functionality of genes with their location on the chromosome
[Figure: Gene 1, Gene 2, Gene 3]
Pedigrees + Phenotypes + Genotypes
Arithmetic Circuit
[Figure: Gene 1, Gene 2]
State of the Art Linkage
Pedigree   Offline (sec)   AC Edges     Online (sec)   Superlink 1.4 (sec)
EE33       25.33           2,070,707    0.59           1,046.72
EE37       61.29           1,855,410    0.39           1,381.61
EE30       376.78          27,997,686   8.37           815.33
EE23       89.47           3,986,816    1.08           502.02
EE18       283.96          23,632,200   6.63           248.11
Model Compilation: Factoring MLFs into ACs
Classical algorithms factor MLFs into ACs:
• Jointree embeds AC
• Variable elimination constructs AC bottom up
• Recursive conditioning constructs AC top down
Factoring MLFs into ACs can be reduced to logical reasoning
Exploiting local structure to build smaller ACs:
• Compiling models with very high treewidth is commonplace
• Boundary between exact and approximate inference is much changed
Public systems now available!