TRANSCRIPT
Knowledge Representation & Reasoning
Lecture #4
UIUC CS 498: Section EA
Professor: Eyal Amir
Fall Semester 2005
(Based on slides by Lise Getoor and Alvaro Cardenas (UMD), in turn based on slides by Nir Friedman (Hebrew U))
Today and Next Class
1. Probabilistic graphical models
2. Treewidth methods:
   1. Variable elimination
   2. Clique tree algorithm
3. Applications du jour: Sensor Networks
Probabilistic Representation of Knowledge
• Knowledge that is deterministic
  – If there is rain, there are clouds: Clouds ∨ ¬Rain
• Knowledge that includes uncertainty
  – If there are clouds, there is a chance of rain
• Probabilistic knowledge
  – If there are clouds, rain has probability 0.3: Pr(Rain=True | Clouds=True) = 0.3
• How do we write probabilistic knowledge?
How to Represent Probabilistic Knowledge?
• A probability distribution is a function from measurable sets of events to [0,1]:
  Pr : Pow(Ω) → [0,1]
  – Example: domain Ω = {T,F} × {T,F}; random variables: Rain, Clouds
• If the domain is discrete, Pr specifies probabilities for all 2^|Ω| sets
• But it is enough to represent only |Ω| values, since
  Pr(a1 ∨ a2) = Pr(a1) + Pr(a2) − Pr(a1 ∧ a2)
• This is not good enough if |Ω| = 2^#rv’s (rv’s = random variables) and #rv’s is large
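To make the 2^#rv’s blow-up concrete, here is a minimal sketch (not from the slides; names and numbers are illustrative) of a full joint distribution stored as an explicit table over n binary random variables:

```python
# A full joint over n binary RVs needs one number per atomic event: 2**n entries.
import itertools

n = 3  # e.g. Rain, Clouds, Wind
joint = {assign: 1 / 2**n                      # uniform, just for brevity
         for assign in itertools.product([True, False], repeat=n)}

# A measurable set of events is a set of atoms, and its probability is the
# sum over those atoms, so |Omega| = 2**n numbers determine Pr on all subsets.
def prob(event):
    return sum(joint[a] for a in event)

rain_true = {a for a in joint if a[0]}         # the event {Rain = True}
print(prob(rain_true))                          # 0.5
```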
Example: How Many RV’s?
Independent Random Variables
• Two variables X and Y are independent if
  – P(X = x | Y = y) = P(X = x) for all values x, y
  – That is, learning the value of Y does not change the prediction of X
• If X and Y are independent then
  – P(X,Y) = P(X|Y) P(Y) = P(X) P(Y)
• In general, if X1,…,Xp are independent, then P(X1,…,Xp) = P(X1) ⋯ P(Xp)
  – Requires only O(p) parameters
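As a quick sketch of that saving (hypothetical marginals, not from the slides), the joint over independent variables is just a product of p numbers:

```python
# Under full independence, p marginals determine the whole joint.
marginals = {"X1": 0.7, "X2": 0.4, "X3": 0.9}   # P(Xi = True), hypothetical

def joint_prob(assignment):
    """P(X1=x1,...,Xp=xp) = product over i of P(Xi = xi)."""
    result = 1.0
    for var, value in assignment.items():
        q = marginals[var]
        result *= q if value else 1 - q
    return result

print(joint_prob({"X1": True, "X2": False, "X3": True}))  # 0.7 * 0.6 * 0.9
```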
Conditional Independence
• Unfortunately, most random variables of interest are not independent of each other
• A more suitable notion is that of conditional independence
• Two variables X and Y are conditionally independent given Z if
  – P(X = x | Y = y, Z = z) = P(X = x | Z = z) for all values x, y, z
  – That is, learning the value of Y does not change the prediction of X once we know the value of Z
  – Notation: I(X, Y | Z)
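A small numeric check of the definition; the joint below is hypothetical and built as P(z)P(x|z)P(y|z), precisely so that I(X, Y | Z) holds:

```python
from itertools import product

# Build P(x, y, z) = P(z) P(x|z) P(y|z) over binary variables.
px1_z, py1_z = [0.9, 0.2], [0.3, 0.7]   # P(X=1|Z=z), P(Y=1|Z=z), hypothetical
P = {}
for x, y, z in product([0, 1], repeat=3):
    px = px1_z[z] if x == 1 else 1 - px1_z[z]
    py = py1_z[z] if y == 1 else 1 - py1_z[z]
    P[x, y, z] = 0.5 * px * py

def cond(num, den):
    return num / den if den > 0 else 0.0

# Verify P(X=x | Y=y, Z=z) == P(X=x | Z=z) for all values x, y, z.
for x, y, z in product([0, 1], repeat=3):
    p_yz = sum(P[x2, y, z] for x2 in (0, 1))
    p_z = sum(P[x2, y2, z] for x2 in (0, 1) for y2 in (0, 1))
    p_xz = sum(P[x, y2, z] for y2 in (0, 1))
    assert abs(cond(P[x, y, z], p_yz) - cond(p_xz, p_z)) < 1e-12
print("I(X, Y | Z) holds")
```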
Example: Family Trees (Pedigree)

• A node represents an individual’s genotype
• Noisy stochastic process
• Modeling assumption: ancestors can affect descendants’ genotype only by passing genetic material through intermediate generations

[Figure: pedigree over Homer, Marge, Bart, Lisa, Maggie]
Markov Assumption

• We now make this independence assumption more precise for directed acyclic graphs (DAGs)
• Each random variable X is independent of its non-descendants, given its parents Pa(X)
• Formally: I(X, NonDesc(X) | Pa(X))

[Figure: a node X with parents Y1 and Y2; its ancestors, parents, non-descendants, and a descendant are marked]
Markov Assumption Example
• In this example:
  – I(E, B)
  – I(B, {E, R})
  – I(R, {A, B, C} | E)
  – I(A, R | B, E)
  – I(C, {B, E, R} | A)

[Figure: the alarm network: Earthquake → Alarm ← Burglary, Earthquake → Radio, Alarm → Call]
I-Maps

• A DAG G is an I-Map of a distribution P if all Markov assumptions implied by G are satisfied by P
  (Assuming G and P both use the same set of random variables)

Examples: consider the graph with no edge between X and Y, which asserts I(X, Y). It is an I-Map of

x y P(x,y)
0 0 0.25
0 1 0.25
1 0 0.25
1 1 0.25

but not of

x y P(x,y)
0 0 0.2
0 1 0.3
1 0 0.4
1 1 0.1
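A sketch of testing the I-Map condition for the graph with no edge between X and Y, whose only Markov assumption is I(X, Y), against the two tables above:

```python
# The edgeless graph is an I-Map of P iff P(x, y) = P(x) P(y) everywhere.
def is_independent(P):
    px = {x: sum(p for (x2, y), p in P.items() if x2 == x) for x in (0, 1)}
    py = {y: sum(p for (x, y2), p in P.items() if y2 == y) for y in (0, 1)}
    return all(abs(P[x, y] - px[x] * py[y]) < 1e-12 for (x, y) in P)

P1 = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
P2 = {(0, 0): 0.20, (0, 1): 0.30, (1, 0): 0.40, (1, 1): 0.10}
print(is_independent(P1))  # True:  the graph is an I-Map of P1
print(is_independent(P2))  # False: it is not an I-Map of P2
```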
Factorization
• Given that G is an I-Map of P, can we simplify the representation of P?
• Example: the graph with no edge between X and Y
• Since I(X, Y), we have that P(X|Y) = P(X)
• Applying the chain rule: P(X,Y) = P(X|Y) P(Y) = P(X) P(Y)
• Thus, we have a simpler representation of P(X,Y)
Factorization Theorem

Thm: if G is an I-Map of P, then

P(X1,…,Xp) = ∏_i P(Xi | Pa(Xi))

Proof:
• By the chain rule: P(X1,…,Xp) = ∏_i P(Xi | X1,…,Xi−1)
• wlog. X1,…,Xp is an ordering consistent with G
• From the assumption: Pa(Xi) ⊆ {X1,…,Xi−1} and {X1,…,Xi−1} \ Pa(Xi) ⊆ NonDesc(Xi)
• Since G is an I-Map, I(Xi, NonDesc(Xi) | Pa(Xi))
• Hence, I(Xi, {X1,…,Xi−1} \ Pa(Xi) | Pa(Xi))
• We conclude P(Xi | X1,…,Xi−1) = P(Xi | Pa(Xi))
Factorization Example
By the chain rule (any ordering):

P(C,A,R,E,B) = P(B) P(E|B) P(R|E,B) P(A|R,B,E) P(C|A,R,B,E)

versus, using the Markov assumptions of the network:

P(C,A,R,E,B) = P(B) P(E) P(R|E) P(A|B,E) P(C|A)

[Figure: the alarm network: Earthquake → Alarm ← Burglary, Earthquake → Radio, Alarm → Call]
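A sketch of the compact version in code. All CPT numbers below are hypothetical; only the factorization structure comes from the slide:

```python
# P(C,A,R,E,B) = P(B) P(E) P(R|E) P(A|B,E) P(C|A), with made-up CPTs.
P_B = {1: 0.01, 0: 0.99}
P_E = {1: 0.02, 0: 0.98}
P_R_given_E = {(1, 1): 0.9, (0, 1): 0.1, (1, 0): 0.0, (0, 0): 1.0}       # key (r, e)
P_A1_given_BE = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}  # P(A=1|b,e)
P_C_given_A = {(1, 1): 0.7, (0, 1): 0.3, (1, 0): 0.01, (0, 0): 0.99}      # key (c, a)

def joint(c, a, r, e, b):
    pa1 = P_A1_given_BE[(b, e)]
    pa = pa1 if a == 1 else 1 - pa1
    return P_B[b] * P_E[e] * P_R_given_E[(r, e)] * pa * P_C_given_A[(c, a)]

print(joint(1, 1, 0, 0, 1))  # one atomic probability from 10 CPT rows, not 2**5 - 1
```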
Consequences
• We can write P in terms of “local” conditional probabilities
• If G is sparse (that is, |Pa(Xi)| ≤ k), then:
  – each conditional probability can be specified compactly
    (e.g., for binary variables, each CPT requires O(2^k) parameters)
  – the representation of P is compact: linear in the number of variables
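A quick worked count of that saving, for hypothetical sizes n = 30 binary variables with at most k = 3 parents each:

```python
n, k = 30, 3
full_joint = 2**n - 1          # explicit joint table
factored = n * 2**k            # upper bound: 2**k rows per CPT
print(full_joint, factored)    # 1073741823 vs 240
```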
Summary
We defined the following concepts:
• The Markov independencies of a DAG G
  – I(Xi, NonDesc(Xi) | Pai)
• G is an I-Map of a distribution P
  – if P satisfies the Markov independencies implied by G
We proved the factorization theorem:
• if G is an I-Map of P, then

P(X1,…,Xn) = ∏_i P(Xi | Pai)
Conditional Independencies

• Let Markov(G) be the set of Markov independencies implied by G
• The factorization theorem shows:

G is an I-Map of P ⇒ P(X1,…,Xn) = ∏_i P(Xi | Pai)

• We can also show the opposite:

Thm: P(X1,…,Xn) = ∏_i P(Xi | Pai) ⇒ G is an I-Map of P
Proof (Outline)

Example: the network X → Y, X → Z. Then

P(Z | X, Y) = P(X, Y, Z) / P(X, Y)
            = P(X) P(Y|X) P(Z|X) / ( P(X) P(Y|X) )
            = P(Z | X)

so I(Z, Y | X) holds, as the Markov assumption for this graph requires.
Implied Independencies
• Does a graph G imply additional independencies as a consequence of Markov(G)?
• We can define a logic of independence statements
• Some axioms:
  – I(X; Y | Z) ⇒ I(Y; X | Z)  (symmetry)
  – I(X; Y1, Y2 | Z) ⇒ I(X; Y1 | Z)  (decomposition)
d-separation
• A procedure d-sep(X; Y | Z, G) that, given a DAG G and sets X, Y, and Z, returns either yes or no
• Goal: d-sep(X; Y | Z, G) = yes iff I(X; Y | Z) follows from Markov(G)
Paths
• Intuition: dependency must “flow” along paths in the graph
• A path is a sequence of neighboring variables
• Examples in the alarm network:
  – R ← E → A ← B
  – C ← A ← E → R
Paths
• We want to know when a path is
  – active -- creates dependency between end nodes
  – blocked -- cannot create dependency between end nodes
• We want to classify the situations in which paths are active.
Path Blockage

Three cases:
– Common cause (R ← E → A):
  • Blocked when the common cause E is given
  • Active (unblocked) when E is not given
Path Blockage

Three cases:
– Common cause
– Intermediate cause (E → A → C):
  • Blocked when the intermediate node A is given
  • Active when A is not given
Path Blockage

Three cases:
– Common cause
– Intermediate cause
– Common effect (E → A ← B, with C a descendant of A):
  • Blocked when neither A nor any of its descendants (e.g., C) is given
  • Active when A or one of its descendants is given
Path Blockage -- General Case
A path is active, given evidence Z, if
• whenever we have the configuration A → B ← C on the path, B or one of its descendants is in Z, and
• no other node on the path is in Z

A path is blocked, given evidence Z, if it is not active.
Example

– d-sep(R, B)?

[Figure: the alarm network: E → A ← B, E → R, A → C]
Example

– d-sep(R, B) = yes
– d-sep(R, B | A)?

[Figure: the alarm network]
Example

– d-sep(R, B) = yes
– d-sep(R, B | A) = no
– d-sep(R, B | E, A)?

[Figure: the alarm network]
d-Separation
• X is d-separated from Y, given Z, if all paths from a node in X to a node in Y are blocked, given Z
• Checking d-separation can be done efficiently (linear time in the number of edges)
  – Bottom-up phase: mark all nodes whose descendants are in Z
  – X-to-Y phase: traverse (BFS) all edges on paths from X to Y and check whether they are blocked
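A compact sketch of a d-separation test. It uses the equivalent ancestral-moral-graph formulation (restrict to ancestors of X ∪ Y ∪ Z, moralize, delete Z, test reachability) rather than the two-phase linear-time procedure above; the alarm-network calls at the end reproduce the examples on the preceding slides, and the third call answers the question left open there:

```python
def d_separated(X, Y, Z, parents):
    """parents: dict node -> set of parents, defining the DAG G."""
    # 1. Restrict to the ancestral graph of X, Y, and Z.
    relevant, stack = set(), list(X | Y | Z)
    while stack:
        v = stack.pop()
        if v not in relevant:
            relevant.add(v)
            stack.extend(parents.get(v, ()))
    # 2. Moralize: marry co-parents, then drop edge directions.
    adj = {v: set() for v in relevant}
    for v in relevant:
        ps = parents.get(v, set()) & relevant
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)
            for q in ps:
                if p != q:
                    adj[p].add(q)
    # 3. Delete Z and test undirected reachability from X to Y.
    seen, stack = set(), [v for v in X if v not in Z]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(w for w in adj[v] if w not in Z)
    return not (seen & Y)

# The alarm network: E -> A <- B, E -> R, A -> C
pa = {"A": {"E", "B"}, "R": {"E"}, "C": {"A"}}
print(d_separated({"R"}, {"B"}, set(), pa))        # True:  d-sep(R, B)
print(d_separated({"R"}, {"B"}, {"A"}, pa))        # False: d-sep(R, B | A)
print(d_separated({"R"}, {"B"}, {"E", "A"}, pa))   # True:  d-sep(R, B | E, A)
```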
Soundness
Thm: If
– G is an I-Map of P, and
– d-sep(X; Y | Z, G) = yes
then P satisfies I(X; Y | Z)

Informally: any independence reported by d-separation is satisfied by the underlying distribution
Completeness
Thm: If d-sep(X; Y | Z, G) = no, then there is a distribution P such that
– G is an I-Map of P, and
– P does not satisfy I(X; Y | Z)

Informally: any independence not reported by d-separation might be violated by the underlying distribution
• We cannot determine this by examining the graph structure alone
Summary: Structure
• We explored DAGs as a representation of conditional independencies:
– Markov independencies of a DAG
– Tight correspondence between Markov(G) and the factorization defined by G
– d-separation, a sound & complete procedure for computing the consequences of the independencies
– Notion of minimal I-Map
– P-Maps
• This theory is the basis for defining Bayesian networks
Inference
• We now have compact representations of probability distributions:
  – Bayesian networks
  – Markov networks
• A network describes a unique probability distribution P
• How do we answer queries about P?
• We use inference as a name for the process of computing answers to such queries
Today
1. Probabilistic graphical models
2. Treewidth methods:
   1. Variable elimination
   2. Clique tree algorithm
3. Applications du jour: Sensor Networks
Queries: Likelihood
• There are many types of queries we might ask.
• Most of these involve evidence
  – Evidence e is an assignment of values to a set E of variables in the domain
  – Without loss of generality, E = { Xk+1, …, Xn }
• Simplest query: compute the probability of the evidence

P(e) = Σ_{x1} ⋯ Σ_{xk} P(x1, …, xk, e)

• This is often referred to as computing the likelihood of the evidence
Queries: A Posteriori Belief

• Often we are interested in the conditional probability of a variable given the evidence:

P(X = x | e) = P(X = x, e) / Σ_x P(X = x, e) = P(X, e) / P(e)

• This is the a posteriori belief in X, given evidence e
• A related task is computing the term P(X, e)
  – i.e., the likelihood of e and X = x for each value x of X
A Posteriori Belief

This query is useful in many cases:
• Prediction: what is the probability of an outcome given the starting condition?
  – The target is a descendant of the evidence
• Diagnosis: what is the probability of a disease/fault given the symptoms?
  – The target is an ancestor of the evidence
• The direction of the edges between variables does not restrict the direction of the queries
Queries: MAP
• In this query we want to find the maximum a posteriori assignment for some variables of interest (say X1,…,Xl)
• That is, the x1,…,xl that maximize the probability P(x1,…,xl | e)
• Note that this is equivalent to maximizing P(x1,…,xl, e)
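A sketch computing all three query types (likelihood, a posteriori belief, MAP) by brute-force enumeration on a hypothetical two-variable network X → Y:

```python
from itertools import product

P_X = {1: 0.3, 0: 0.7}
P_Y_given_X = {(1, 1): 0.8, (0, 1): 0.2, (1, 0): 0.1, (0, 0): 0.9}  # key (y, x)
joint = {(x, y): P_X[x] * P_Y_given_X[(y, x)] for x, y in product((0, 1), repeat=2)}

e_y = 1                                                      # evidence: Y = 1
likelihood = sum(joint[x, e_y] for x in (0, 1))              # P(e)
posterior = {x: joint[x, e_y] / likelihood for x in (0, 1)}  # P(X | e)
map_x = max(posterior, key=posterior.get)                    # argmax_x P(x | e)
# Maximizing P(x | e) and maximizing P(x, e) pick the same x, as noted above.
print(likelihood, posterior, map_x)                          # 0.31 {...} 1
```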
Queries: MAP
We can use MAP for:
• Classification
  – find the most likely label, given the evidence
• Explanation
  – what is the most likely scenario, given the evidence?
Complexity of Inference
Thm:
Computing P(X = x) in a Bayesian network is NP-hard
Not surprising, since we can simulate Boolean gates.
Approaches to inference
• Exact inference
  – Inference in simple chains
  – Variable elimination
  – Clustering / join tree algorithms
• Approximate inference (later in the semester)
  – Stochastic simulation / sampling methods
  – Markov chain Monte Carlo methods
  – Mean field theory (your presentation)
Variable Elimination
General idea:
• Write the query in the form

P(X1, e) = Σ_{xk} ⋯ Σ_{x3} Σ_{x2} ∏_i P(xi | pai)

• Iteratively
  – Move all irrelevant terms outside of the innermost sum
  – Perform the innermost sum, getting a new term
  – Insert the new term into the product
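A minimal sketch of these rewriting steps as executable code (not the lecture's notation): factors are dicts over binary assignments, and elimination repeatedly multiplies the factors that mention a variable and sums it out:

```python
from itertools import product

def multiply(f, g):
    """Pointwise product; the scope is the union of the two scopes."""
    scope = list(dict.fromkeys(f["scope"] + g["scope"]))
    vals = {}
    for assign in product((0, 1), repeat=len(scope)):
        env = dict(zip(scope, assign))
        fa = tuple(env[v] for v in f["scope"])
        ga = tuple(env[v] for v in g["scope"])
        vals[assign] = f["vals"][fa] * g["vals"][ga]
    return {"scope": scope, "vals": vals}

def sum_out(f, var):
    """Innermost sum: eliminate var from factor f."""
    idx = f["scope"].index(var)
    vals = {}
    for assign, p in f["vals"].items():
        key = assign[:idx] + assign[idx + 1:]
        vals[key] = vals.get(key, 0.0) + p
    return {"scope": [v for v in f["scope"] if v != var], "vals": vals}

def eliminate(factors, order):
    """Multiply the factors mentioning each variable in turn, then sum it out."""
    for var in order:
        touched = [f for f in factors if var in f["scope"]]
        if not touched:
            continue
        prod = touched[0]
        for f in touched[1:]:
            prod = multiply(prod, f)
        factors = [f for f in factors if var not in f["scope"]] + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Tiny chain X -> Y with hypothetical CPTs: P(Y) = sum_x P(x) P(y|x)
fX = {"scope": ["X"], "vals": {(0,): 0.7, (1,): 0.3}}
fYX = {"scope": ["Y", "X"],
       "vals": {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.8}}
print(eliminate([fX, fYX], ["X"])["vals"])  # P(Y): {(0,): 0.69, (1,): 0.31}
```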
Example

• The “Asia” network:

[Figure: Visit to Asia → Tuberculosis; Smoking → Lung Cancer and Bronchitis; Tuberculosis and Lung Cancer → Abnormality in Chest; Abnormality in Chest → X-Ray; Abnormality in Chest and Bronchitis → Dyspnea]
“Brute force approach”

• We want to compute P(d)
• Need to eliminate: v, s, x, t, l, a, b
• Initial factors:

P(v,s,t,l,a,b,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

• Brute force:

P(d) = Σ_v Σ_s Σ_t Σ_l Σ_a Σ_b Σ_x P(v,s,t,l,a,b,x,d)

• Complexity is exponential in the size of the graph: O(N^T), where T = number of variables and N = number of states for each variable
• We want to compute P(d)
• Need to eliminate: v, s, x, t, l, a, b
• Initial factors:

P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: v. Compute:

f_v(t) = Σ_v P(v) P(t|v)

⇒ f_v(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Note: f_v(t) = P(t). In general, the result of elimination is not necessarily a probability term.
• We want to compute P(d)
• Need to eliminate: s, x, t, l, a, b
• Current factors:

f_v(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: s. Compute:

f_s(b,l) = Σ_s P(s) P(b|s) P(l|s)

⇒ f_v(t) f_s(b,l) P(a|t,l) P(x|a) P(d|a,b)

Summing on s results in a factor with two arguments, f_s(b,l). In general, the result of elimination may be a function of several variables.
• We want to compute P(d)
• Need to eliminate: x, t, l, a, b
• Current factors:

f_v(t) f_s(b,l) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: x. Compute:

f_x(a) = Σ_x P(x|a)

⇒ f_v(t) f_s(b,l) f_x(a) P(a|t,l) P(d|a,b)

Note: f_x(a) = 1 for all values of a!
• We want to compute P(d)
• Need to eliminate: t, l, a, b
• Current factors:

f_v(t) f_s(b,l) f_x(a) P(a|t,l) P(d|a,b)

Eliminate: t. Compute:

f_t(a,l) = Σ_t f_v(t) P(a|t,l)

⇒ f_s(b,l) f_x(a) f_t(a,l) P(d|a,b)
• We want to compute P(d)
• Need to eliminate: l, a, b
• Current factors:

f_s(b,l) f_x(a) f_t(a,l) P(d|a,b)

Eliminate: l. Compute:

f_l(a,b) = Σ_l f_s(b,l) f_t(a,l)

⇒ f_x(a) f_l(a,b) P(d|a,b)
• We want to compute P(d)
• Need to eliminate: a, b
• Current factors:

f_x(a) f_l(a,b) P(d|a,b)

Eliminate: a, then b. Compute:

f_a(b,d) = Σ_a f_x(a) f_l(a,b) P(d|a,b)
f_b(d) = Σ_b f_a(b,d)

which yields P(d).
• Different elimination ordering:
• Need to eliminate: a, b, x, t, v, s, l
• Initial factors:

P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

• Intermediate factors:

g_a(l,t,d,b,x)
g_b(l,t,d,x,s)
g_x(l,t,d,s)
g_t(l,d,s,v)
g_v(l,d,s)
g_s(l,d)
g_l(d)

• Complexity is exponential in the size of the factors!
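A sketch that reproduces this ordering effect by tracking factor scopes only (no probabilities needed); the two calls correspond to the two orderings above:

```python
def scope_trace(factors, order):
    """Sizes of the new factor created at each elimination step."""
    widths = []
    for var in order:
        touched = [s for s in factors if var in s]
        new = set().union(*touched) - {var}
        widths.append(len(new))
        factors = [s for s in factors if var not in s] + [new]
    return widths

asia = [{"v"}, {"s"}, {"t", "v"}, {"l", "s"}, {"b", "s"},
        {"a", "t", "l"}, {"x", "a"}, {"d", "a", "b"}]
print(scope_trace(asia, ["v", "s", "x", "t", "l", "a", "b"]))  # [1, 2, 1, 2, 2, 2, 1]
print(scope_trace(asia, ["a", "b", "x", "t", "v", "s", "l"]))  # [5, 5, 4, 4, 3, 2, 1]
```

The largest intermediate factor has two variables under the first ordering but five under the second, which is exactly the exponential gap the slide points at.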
Variable Elimination
• We now understand variable elimination as a sequence of rewriting operations
• Actual computation is done in elimination step
• Exactly the same computation procedure applies to Markov networks
• Computation depends on order of elimination
Markov Networks (Undirected Graphical Models)

• A graph with hyper-edges (multi-vertex edges)
• Every hyper-edge e = (x1…xk) has a potential function f_e(x1…xk)
• The probability distribution is

P(X1,…,Xn) = (1/Z) ∏_{e∈E} f_e(x_{e,1}, …, x_{e,k})

Z = Σ_{x1} ⋯ Σ_{xn} ∏_{e∈E} f_e(x_{e,1}, …, x_{e,k})
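A brute-force sketch of this definition, including the partition function Z, on a hypothetical two-edge network over three binary variables:

```python
from itertools import product

# Hyper-edges with hypothetical potential tables.
edges = [
    (("X1", "X2"), {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}),
    (("X2", "X3"), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}),
]
variables = ["X1", "X2", "X3"]

def unnormalized(env):
    score = 1.0
    for scope, table in edges:
        score *= table[tuple(env[v] for v in scope)]
    return score

Z = sum(unnormalized(dict(zip(variables, a)))
        for a in product((0, 1), repeat=len(variables)))

def P(env):
    return unnormalized(env) / Z

print(Z, P({"X1": 1, "X2": 1, "X3": 1}))  # Z = 12.0; P(1,1,1) = 4/12
```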
Complexity of Variable Elimination

• Suppose in one elimination step we compute

f'_x(y1,…,yk) = Σ_x f_x(x, y1,…,yk)

where

f_x(x, y1,…,yk) = ∏_{i=1}^m f_i(x, y_{i,1}, …, y_{i,l_i})

This requires:
• m · |Val(X)| · ∏_i |Val(Yi)| multiplications
  – for each value of x, y1,…,yk, we do m multiplications
• |Val(X)| · ∏_i |Val(Yi)| additions
  – for each value of y1,…,yk, we do |Val(X)| additions

Complexity is exponential in the number of variables in the intermediate factor.
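Plugging hypothetical sizes into these counts:

```python
m, val_x, val_ys = 4, 2, [3, 3]   # 4 factors mention X; |Val(X)|=2; two Y's with 3 values
cells = val_x
for v in val_ys:
    cells *= v                     # |Val(X)| * prod_i |Val(Yi)| = 18 table cells
print(m * cells, cells)            # 72 multiplications, 18 additions
```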
Undirected Graph Representation

• At each stage of the procedure, we have an algebraic term that we need to evaluate
• In general this term is of the form

P(x1,…,xk) = Σ_{y1} ⋯ Σ_{yn} ∏_i f_i(Z_i)

where the Z_i are sets of variables
• We now plot a graph with an undirected edge X--Y if X and Y are arguments of some factor
  – that is, if X and Y are in some Z_i
• Note: this is the Markov network that describes the probability distribution on the variables we have not yet eliminated
Chordal Graphs

• elimination ordering ⇒ undirected chordal graph

Graph:
• Maximal cliques are factors in elimination
• Factors in elimination are cliques in the graph
• Complexity is exponential in the size of the largest clique in the graph

[Figure: the Asia DAG next to the chordal graph produced by elimination]
Induced Width
• The size of the largest clique in the induced graph is thus an indicator for the complexity of variable elimination
• This quantity is called the induced width of a graph according to the specified ordering
• Finding a good ordering for a graph is equivalent to finding the minimal induced width of the graph
PolyTrees

• A polytree is a network where there is at most one path between any two variables

Thm:
• Inference in a polytree is linear in the representation size of the network
  – This assumes a tabular CPT representation

[Figure: an example polytree over nodes A through H]
Today
1. Probabilistic graphical models
2. Treewidth methods:
   1. Variable elimination
   2. Clique tree algorithm
3. Applications du jour: Sensor Networks
Junction Tree

• Why junction tree?
  – More efficient for some tasks than variable elimination
  – We can avoid cycles if we turn highly-interconnected subsets of the nodes into “supernodes” (clusters)
• Objective: compute P(V = v | E = e)
  – v is a value of a variable V, and e is evidence for a set of variables E
Properties of Junction Tree

• An undirected tree
• Each node is a cluster (nonempty set) of variables
• Running intersection property:
  – Given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y
• Separator sets (sepsets):
  – the intersection of two adjacent clusters

[Figure: clusters ABD –[AD]– ADE –[DE]– DEF; e.g., cluster ABD, sepset DE]
Potentials

• A potential φ_X over a set of variables X maps each instantiation of X to a nonnegative real number: φ_X : Val(X) → ℝ≥0
• Marginalization
  – φ_X = Σ_{Y\X} φ_Y, the marginalization of φ_Y into X ⊆ Y
• Multiplication
  – φ_Z = φ_X φ_Y, the multiplication of φ_X and φ_Y, where Z = X ∪ Y
Properties of Junction Tree

• Belief potentials:
  – map each instantiation of a cluster or sepset into a real number
• Constraints:
  – Consistency: for each cluster X and neighboring sepset S,

Σ_{X\S} φ_X = φ_S

  – The joint distribution:

P(U) = ∏_i φ_{X_i} / ∏_j φ_{S_j}
Properties of Junction Tree

• If a junction tree satisfies these properties, it follows that:
  – For each cluster (or sepset) X, φ_X = P(X)
  – The probability distribution of any variable V can be computed from any cluster (or sepset) X that contains V:

P(V) = Σ_{X\{V}} φ_X
Building Junction Trees

DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree
Constructing the Moral Graph

[Figure: the example DAG over nodes A through H]
Constructing the Moral Graph

• Add undirected edges between all co-parents which are not currently joined
  – “Marrying” parents

[Figure: the DAG with the moralizing edges added]
Constructing the Moral Graph

• Add undirected edges between all co-parents which are not currently joined
  – “Marrying” parents
• Drop the directions of the arcs

[Figure: the resulting moral graph]
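A sketch of moralization as code; the example DAG below is hypothetical (the slide's exact edges live in the figure):

```python
def moralize(parents):
    """Moral graph of a DAG given as child -> set-of-parents."""
    adj = {v: set() for v in parents}
    for child, ps in parents.items():
        for p in ps:
            adj.setdefault(p, set())
            adj[child].add(p)
            adj[p].add(child)    # drop edge directions
            for q in ps:         # marry co-parents
                if p != q:
                    adj[p].add(q)
    return adj

dag = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B"}, "E": {"C"},
       "F": {"D", "E"}, "G": {"C"}, "H": {"E", "G"}}
print(sorted(moralize(dag)["E"]))  # ['C', 'D', 'F', 'G', 'H']; D and G by marriage
```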
Triangulating

• An undirected graph is triangulated iff every cycle of length > 3 contains an edge that connects two nonadjacent nodes of the cycle

[Figure: the triangulated moral graph]
Identifying Cliques

• A clique is a subgraph of an undirected graph that is complete and maximal

[Figure: the triangulated graph with cliques ABD, ADE, ACE, DEF, CEG, EGH marked]
Junction Tree

• A junction tree is a subgraph of the clique graph that
  – is a tree,
  – contains all the cliques, and
  – satisfies the running intersection property

Cliques: ABD, ADE, ACE, DEF, CEG, EGH

Junction tree: ABD –[AD]– ADE –[AE]– ACE –[CE]– CEG –[EG]– EGH, with DEF attached to ADE through sepset DE
Principle of Inference

DAG → Junction Tree → (Initialization) → Inconsistent Junction Tree → (Propagation) → Consistent Junction Tree → (Marginalization) → P(V = v | E = e)
Example: Create Join Tree

HMM with 2 time steps:

X1 → X2
 ↓    ↓
Y1    Y2

Junction Tree:

(X1,Y1) –[X1]– (X1,X2) –[X2]– (X2,Y2)
Example: Initialization

Variable | Associated cluster | Potential function
X1       | X1,Y1              | φ(X1,Y1) = P(X1)
Y1       | X1,Y1              | φ(X1,Y1) = P(X1) P(Y1|X1)
X2       | X1,X2              | φ(X1,X2) = P(X2|X1)
Y2       | X2,Y2              | φ(X2,Y2) = P(Y2|X2)
Example: Collect Evidence

• Choose an arbitrary clique, e.g. X1,X2, where all potential functions will be collected.
• Call recursively neighboring cliques for messages:
• 1. Call X1,Y1:
  – 1. Projection: φ_{X1} = Σ_{{X1,Y1}\{X1}} φ_{X1,Y1} = Σ_{Y1} P(X1,Y1) = P(X1)
  – 2. Absorption: φ_{X1,X2} = φ_{X1,X2} · φ_{X1} / φ_{X1}^old = P(X2|X1) P(X1) = P(X1,X2)
Example: Collect Evidence (cont.)

• 2. Call X2,Y2:
  – 1. Projection: φ_{X2} = Σ_{{X2,Y2}\{X2}} φ_{X2,Y2} = Σ_{Y2} P(Y2|X2) = 1
  – 2. Absorption: φ_{X1,X2} = φ_{X1,X2} · φ_{X2} / φ_{X2}^old = P(X1,X2) · 1 = P(X1,X2)
Example: Distribute Evidence

• Pass messages recursively to neighboring nodes
• Pass message from X1,X2 to X1,Y1:
  – 1. Projection: φ_{X1} = Σ_{{X1,X2}\{X1}} φ_{X1,X2} = Σ_{X2} P(X1,X2) = P(X1)
  – 2. Absorption: φ_{X1,Y1} = φ_{X1,Y1} · φ_{X1} / φ_{X1}^old = P(X1,Y1) · P(X1) / P(X1) = P(X1,Y1)
Example: Distribute Evidence (cont.)

• Pass message from X1,X2 to X2,Y2:
  – 1. Projection: φ_{X2} = Σ_{{X1,X2}\{X2}} φ_{X1,X2} = Σ_{X1} P(X1,X2) = P(X2)
  – 2. Absorption: φ_{X2,Y2} = φ_{X2,Y2} · φ_{X2} / φ_{X2}^old = P(Y2|X2) · P(X2) / 1 = P(Y2,X2)
Example: Inference with Evidence

• Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation)
• Assign likelihoods to the potential functions during initialization:

φ_{X1,Y1} = 0 if Y1 = 1, and P(X1, Y1=0) if Y1 = 0
φ_{X2,Y2} = 0 if Y2 = 0, and P(Y2=1 | X2) if Y2 = 1
Example: Inference with Evidence (cont.)

• Repeating the same steps as in the previous case, we obtain:

φ_{X1,Y1} = 0 if Y1 = 1, and P(X1, Y1=0, Y2=1) if Y1 = 0
φ_{X1} = P(X1, Y1=0, Y2=1)
φ_{X1,X2} = P(X1, Y1=0, X2, Y2=1)
φ_{X2} = P(Y1=0, X2, Y2=1)
φ_{X2,Y2} = 0 if Y2 = 0, and P(Y1=0, X2, Y2=1) if Y2 = 1

Normalizing φ_{X2} yields the query P(X2 | Y1=0, Y2=1).
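A sketch of the whole computation above on the two-step HMM, with hypothetical CPT numbers, folding the evidence Y1 = 0, Y2 = 1 into the leaf potentials and collecting into the X1,X2 cluster (the query can then be read off that one cluster, so the distribute pass is not needed for it):

```python
import numpy as np

P_X1 = np.array([0.6, 0.4])                   # P(X1), hypothetical
P_X2_X1 = np.array([[0.7, 0.3], [0.2, 0.8]])  # P(X2 | X1), rows index x1
P_Y_X = np.array([[0.9, 0.1], [0.3, 0.7]])    # P(Y | X), rows index x

# Initialization, with the evidence zeroing out incompatible entries.
phi_x1y1 = P_X1[:, None] * P_Y_X              # P(X1, Y1)
phi_x1y1[:, 1] = 0.0                          # keep only Y1 = 0
phi_x2y2 = P_Y_X.copy()                       # P(Y2 | X2)
phi_x2y2[:, 0] = 0.0                          # keep only Y2 = 1
phi_x1x2 = P_X2_X1.copy()                     # P(X2 | X1)

# Collect into X1,X2: project each leaf onto its sepset, then absorb.
phi_x1x2 *= phi_x1y1.sum(axis=1)[:, None]     # message over sepset X1
phi_x1x2 *= phi_x2y2.sum(axis=1)[None, :]     # message over sepset X2

# phi_x1x2 now equals P(X1, Y1=0, X2, Y2=1); normalize for the query.
posterior_x2 = phi_x1x2.sum(axis=0)
posterior_x2 /= posterior_x2.sum()
print(posterior_x2)                           # P(X2 | Y1=0, Y2=1)
```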
Next Time
• Inference with Propositional Logic
• Later in the semester:
  (a) Approximate probabilistic inference via sampling: Gibbs, Priority, MCMC
  (b) Approximate probabilistic inference using a close, simpler distribution
THE END
Example: Naïve Bayesian Model
• A common model in early diagnosis:
  – Symptoms are conditionally independent given the disease (or fault)
• Thus, if
  – X1,…,Xp denote the symptoms exhibited by the patient (headache, high fever, etc.), and
  – H denotes the hypothesis about the patient’s health,
• then P(X1,…,Xp,H) = P(H) P(X1|H) ⋯ P(Xp|H)
• This naïve Bayesian model allows a compact representation
  – It does embody strong independence assumptions
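A sketch of diagnosis with this model; all numbers and symptom names are hypothetical:

```python
P_H = {"flu": 0.1, "healthy": 0.9}                     # prior P(H)
P_sym = {                                              # P(symptom present | H)
    "headache":   {"flu": 0.8, "healthy": 0.1},
    "high_fever": {"flu": 0.7, "healthy": 0.05},
}

def posterior(symptoms):
    """P(H | X1,...,Xp) is proportional to P(H) * prod_i P(Xi | H)."""
    scores = {}
    for h, prior in P_H.items():
        score = prior
        for sym, present in symptoms.items():
            p = P_sym[sym][h]
            score *= p if present else 1 - p
        scores[h] = score
    z = sum(scores.values())
    return {h: s / z for h, s in scores.items()}

print(posterior({"headache": True, "high_fever": True}))  # flu becomes likely
```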
Elimination on Trees
• Formally, for any tree, there is an elimination ordering with induced width = 1

Thm:
• Inference on trees is linear in the number of variables