Rome - Feb. 2010
Lesson 6
Belief propagation
Sergio Barbarossa
Pairwise Markov Random Fields
Scene: (for example, the pixels of an image)
- The scene of interest must have a structure
- A general structure model is the statistical dependency among the pixels of the scene
- The field is pairwise Markov if the statistical dependency can be expressed through functions of pairs of nodes
Observation:
- The observation is itself related to the scene through a statistical model
[Figure: observation model]
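A simple instance of such a model, assuming (only for illustration) that each pixel is observed in additive independent noise:
\[ y_i = x_i + v_i, \qquad p(\mathbf{y} \mid \mathbf{x}) = \prod_i p(y_i \mid x_i). \]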
The best way to model a pairwise statistical dependency is a graph representation, where each vertex represents a random variable
There is an edge between the nodes representing the random variables x_i and x_j if and only if the compatibility function of the pair is different from zero
The overall statistical dependency factorizes into the product of the pairwise compatibility functions and of the local evidence terms (see the sketch below)
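A sketch of this factorization, in the notation of [1] (ψ_ij denotes the compatibility function of the pair (i, j), φ_i the local evidence linking x_i to its observation y_i):
\[ p(\mathbf{x} \mid \mathbf{y}) \;\propto\; \prod_{(i,j)} \psi_{ij}(x_i, x_j)\, \prod_i \phi_i(x_i, y_i). \]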
The goal is to recover the underlying field from the observations
Examples
[Figure: two example graphs. (a) Four hidden nodes x1, x2, x3, x4, each with its own observation y1, y2, y3, y4. (b) A single hidden node x1 observed through y1, y2, y3, y4.]
The number of observations can be equal to the number of unknowns (graph (a))
The number of observations can be greater than the number of unknowns (graph (b))
Standard belief propagation
Goal: allow each node to compute its belief (a posteriori probability), given the observations of all the nodes, in a totally distributed way
Each variable exchanges messages only with its neighbors
In an MRF with discrete rv’s, the messages are vectors of size equal to the number of values that the rv can assume
The message sent from node i to node j is about what state node j should be in, according to node i:
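A sketch of the standard sum-product update, in the notation of [1] (φ_i is the local evidence, ψ_ij the compatibility function, N(i) the neighborhood of node i):
\[ m_{ij}(x_j) \;\leftarrow\; \sum_{x_i} \phi_i(x_i, y_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i)\setminus j} m_{ki}(x_i). \]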
Belief propagation
The belief computed at node i is proportional to the product of the local evidence and the messages coming into node i
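In the same notation, a sketch of the belief (following [1]):
\[ b_i(x_i) \;\propto\; \phi_i(x_i, y_i) \prod_{k \in N(i)} m_{ki}(x_i). \]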
[Figure: flow of messages]
Belief propagation: example
[Figure: four hidden nodes x1, x2, x3, x4 with observations y1, y2, y3, y4]
Belief at node 1:
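Assuming, purely for illustration, that node 1 is connected to nodes 2, 3 and 4, the belief at node 1 would read
\[ b_1(x_1) \;\propto\; \phi_1(x_1, y_1)\, m_{21}(x_1)\, m_{31}(x_1)\, m_{41}(x_1). \]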
Belief propagation
BP is a distributed way to compute a marginal pdf
If the graph has no loops, BP provides the exact marginal at every node in a finite number of steps
The whole computation takes a time proportional to the number of links, which is significantly less than the exponential cost of the straightforward marginalization obtained by summing directly over all the configurations of the variables
Substituting the message updates into the belief expression, one can verify that, on a loop-free graph, the belief at each node coincides with the exact marginal
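A minimal numerical sketch of these local computations in Python, assuming a three-node chain with binary variables and random evidence and compatibility tables (the chain, the tables and all names in the snippet are illustrative choices, not the lecture's example):

import numpy as np

# Minimal sum-product BP sketch on a small loop-free MRF with discrete
# variables (illustrative assumptions, not the lecture's own code).
# phi[i] is the local evidence vector phi_i(x_i, y_i) for a fixed
# observation, psi[(i, j)] the pairwise compatibility matrix psi_ij.

K = 2                      # number of states of each variable
edges = [(0, 1), (1, 2)]   # a three-node chain: x0 - x1 - x2
neighbors = {0: [1], 1: [0, 2], 2: [1]}

rng = np.random.default_rng(0)
phi = [rng.random(K) + 0.1 for _ in range(3)]            # local evidence
psi = {e: rng.random((K, K)) + 0.1 for e in edges}       # compatibilities
psi.update({(j, i): psi[(i, j)].T for (i, j) in edges})  # symmetric access

# Messages m_ij(x_j), initialized to uniform vectors
messages = {(i, j): np.ones(K) / K for i in neighbors for j in neighbors[i]}

# Synchronous message updates:
# m_ij(x_j) = sum_{x_i} phi_i(x_i) psi_ij(x_i, x_j) prod_{k in N(i)\j} m_ki(x_i)
for _ in range(5):                      # a few sweeps suffice on a tree
    new = {}
    for (i, j) in messages:
        prod = phi[i].copy()
        for k in neighbors[i]:
            if k != j:
                prod *= messages[(k, i)]
        m = psi[(i, j)].T @ prod        # sum over x_i
        new[(i, j)] = m / m.sum()       # normalize for numerical stability
    messages = new

# Belief at node i: local evidence times all incoming messages
for i in neighbors:
    b = phi[i].copy()
    for k in neighbors[i]:
        b *= messages[(k, i)]
    print(f"belief at node {i}:", b / b.sum())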
Belief propagation
BP is indeed a way to organize global computations of marginal beliefs in terms of local (simpler) computations
The BP algorithm described above does not make any reference to the graph topology. However, if there are loops and they are ignored, the messages could circulate forever and the process might not converge or it might converge to an incorrect value
Consensus estimation via belief propagation
Consensus estimation can be interpreted as BP over the following graph, where yi denote the observation variables and x is the common parameter to be estimated
[Figure: star graph with the common parameter x connected to the observations y1, y2, y3, y4]
The evidence function, the compatibility function and the resulting message for this graph are sketched below.
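A hedged reconstruction of these quantities, assuming scalar Gaussian observations y_i = x + v_i with v_i ~ N(0, σ_i^2), where \hat{y}_i denotes the measured value (all the symbols below are illustrative choices):
\[ \phi_i(y_i) = \delta(y_i - \hat{y}_i), \qquad \psi_i(x, y_i) = p(y_i \mid x) \propto \exp\!\left(-\frac{(y_i - x)^2}{2\sigma_i^2}\right), \]
so that the message sent from observation node i to x is
\[ m_i(x) = \int \phi_i(y_i)\,\psi_i(x, y_i)\, dy_i = p(\hat{y}_i \mid x), \]
and the belief at x is proportional to the product of all the incoming messages, b(x) ∝ ∏_i m_i(x).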
Consensus estimation via belief propagation
Taking logarithms of the belief turns the product of the messages into a sum: distributed consensus thus reduces to a (local) linear combination of the logarithms of the messages
In the Gaussian case, the nodes must exchange only mean vectors and covariance matrices (see the sketch below)
The convergence depends on the topology of the graph
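For the Gaussian case, a sketch of why exchanging the pairs (mean, covariance) is enough, with μ_i and C_i denoting the assumed parameters of the i-th message:
\[ \log b(x) = \mathrm{const} + \sum_i \log m_i(x), \qquad \log m_i(x) = \mathrm{const} - \tfrac{1}{2}\,(x-\mu_i)^\top C_i^{-1}(x-\mu_i), \]
i.e. each log-message is a quadratic form completely described by the pair (μ_i, C_i).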
Consensus estimation via belief propagation
Given the Gaussian observation model, we can define each message through its mean and covariance and use this alternative notation. Consider then the product of the Gaussian messages arriving at a node (a sketch of the computation is given below).
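Under the Gaussian assumption the messages can be written as m_k(x) ∝ N(x; μ_k, C_k) (the symbols μ_k and C_k are introduced here only for illustration); the product of such factors is itself Gaussian, with
\[ C^{-1} = \sum_k C_k^{-1}, \qquad C^{-1}\mu = \sum_k C_k^{-1}\mu_k. \]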
Then the resulting belief is itself Gaussian: its covariance and mean are obtained by combining the covariances and means of the incoming messages with those of the local observation model (a sketch is given below).
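Assuming the linear Gaussian observation model y_i = x + v_i, with v_i ~ N(0, C_i) (an illustrative assumption consistent with the consensus setting above), once all the messages have been collected the belief takes the form
\[ b(x) \;\propto\; \exp\!\left(-\tfrac{1}{2}(x-\mu)^\top C^{-1}(x-\mu)\right), \qquad C^{-1} = \sum_i C_i^{-1}, \quad \mu = C \sum_i C_i^{-1} y_i. \]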
Consensus estimation via belief propagation
Messages at step n: at each step, every node updates its (mean, covariance) pair by combining its local observation with the pairs received from its neighbors at the previous step.
After a number of steps proportional to the number of links, every node has gathered the contribution of all the other nodes, and it is then able to compute the globally optimal ML estimate (sketched below).
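Under the same assumed observation model, this ML estimate is the precision-weighted average of the observations:
\[ \hat{x}_{\mathrm{ML}} = \Big(\sum_i C_i^{-1}\Big)^{-1} \sum_i C_i^{-1}\, y_i, \]
which coincides with the mean of the belief computed above.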
References
[1] J. Yedidia, W. Freeman, Y. Weiss, “Understanding Belief Propagation and its Generalizations”
[2] Y. Weiss, W. Freeman, “Correctness of belief propagation in Gaussian graphical models of arbitrary topology”