
Bayesian Network

By DengKe Dong

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Intro to Graphical Model

Two types of graphical models:

-- Directed graphs (aka Bayesian Networks)

-- Undirected graphs (aka Markov Random Fields)

The graphical structure, plus its associated parameters, defines a joint probability distribution over the set of variables / nodes.

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Conditional Independence

Definition: X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y, given the value of Z.

We denote this as (X ⊥ Y | Z), i.e.

P(X | Y, Z) = P(X | Z)

Conditional Independence

Condition on its parents: a conditional probability distribution (CPD) is associated with each node N, defined as

P(N | Parents(N))

where the function Parents(N) returns the set of N's immediate parents.
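As a minimal sketch (the variable names and probability values below are invented for illustration, not from the slides), a CPD for a binary node with one binary parent is just a lookup table:

```python
# A CPD P(N | Parents(N)) stored as a plain lookup table.
# Keys are assignments to the parents; values give P(N=1 | parents).
# All numbers here are illustrative.
cpd_N = {
    (0,): 0.10,  # P(N=1 | Parent=0)
    (1,): 0.75,  # P(N=1 | Parent=1)
}

def p_N(n, parent_values):
    """Return P(N=n | parents) for a binary node N."""
    p1 = cpd_N[tuple(parent_values)]
    return p1 if n == 1 else 1.0 - p1

print(p_N(1, [0]))  # 0.10
print(p_N(0, [1]))  # 0.25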

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Bayesian Network Definition

Definition: a directed acyclic graph defining a joint probability distribution over a set of variables, where each node denotes a random variable and each edge denotes a dependence between the connected nodes.

(For example, the network over S, L, R, T, W used below.)

Bayesian Network Definition

Conditional Independencies in Bayesian Network:

Each node is conditionally independent of its non-descendants, given only its immediate parents.

So, the joint distribution over all variables in the network is defined in terms of these CPDs, plus the graph.

Bayesian Network Definition

Example: the chain rule for probability gives

P(S,L,R,T,W) = P(S) P(L|S) P(R|S,L) P(T|S,L,R) P(W|S,L,R,T)

With a CPD for each node Xi, written P(Xi | Pa(Xi)), the conditional independencies reduce this to

P(S,L,R,T,W) = P(S) P(L|S) P(R|S) P(T|L) P(W|L,R)

So, in a Bayes net:

P(X1, ..., Xn) = Πi P(Xi | Pa(Xi))
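As a minimal sketch of this factorization in code, the joint probability of a full assignment is just a product of CPT lookups. The network structure is the one above; all CPT values are invented for illustration, since the slides give none:

```python
# Joint probability of a full assignment in the example network
#   P(S,L,R,T,W) = P(S) P(L|S) P(R|S) P(T|L) P(W|L,R)
# Each table maps parent values to the probability the child is 1.
P_S = 0.3
P_L = {0: 0.2, 1: 0.6}            # P(L=1 | S)
P_R = {0: 0.1, 1: 0.7}            # P(R=1 | S)
P_T = {0: 0.05, 1: 0.8}           # P(T=1 | L)
P_W = {(0, 0): 0.01, (0, 1): 0.6, # P(W=1 | L, R)
       (1, 0): 0.7,  (1, 1): 0.95}

def bern(p, x):
    """P(X=x) for a Bernoulli variable with P(X=1)=p."""
    return p if x == 1 else 1.0 - p

def joint(s, l, r, t, w):
    return (bern(P_S, s) * bern(P_L[s], l) * bern(P_R[s], r)
            * bern(P_T[l], t) * bern(P_W[(l, r)], w))

print(joint(1, 1, 0, 1, 1))
```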

Bayesian Network Definition

Construction: choose an ordering over the variables, e.g. X1, X2, ..., Xn

For i = 1 to n:

Add Xi to the network

Select parents Pa(Xi) as a minimal subset of {X1, ..., Xi-1} such that

P(Xi | Pa(Xi)) = P(Xi | X1, ..., Xi-1)

Notice this choice of parents assures

P(X1, ..., Xn) = Πi P(Xi | Pa(Xi))

(A code sketch of this procedure follows.)
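A minimal sketch of this construction, assuming a hypothetical oracle is_independent(x, y, given) that answers conditional-independence queries about the target distribution (from domain knowledge or statistical tests); the greedy pruning below approximates the "minimal subset" step rather than guaranteeing minimality:

```python
# Ordering-based Bayes net construction, a sketch under the stated
# assumptions. `is_independent` is a hypothetical oracle.
def build_bn(variables, is_independent):
    parents = {}
    for i, x in enumerate(variables):
        predecessors = variables[:i]
        # Start with all predecessors and greedily drop any y that
        # x is independent of given the remaining candidates.
        pa = list(predecessors)
        for y in predecessors:
            rest = [z for z in pa if z != y]
            if is_independent(x, y, rest):
                pa = rest
        parents[x] = pa
    return parents
```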

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Reasoning BN: D-Separation

Conditional independence, revisited. We said:

-- Each node is conditionally independent of its non-descendants, given its immediate parents.

Does this rule give us all of the conditional independence relations implied by the Bayes network?

a. No

b. E.g., X1 and X4 are conditionally independent given {X2, X3}

c. But X1 and X4 are not conditionally independent given X3 alone

d. For this, we need to understand D-separation ...

Reasoning BN: D-Separation

Three examples to understand D-separation:

-- Head-to-tail

-- Tail-to-tail

-- Head-to-head

Reasoning BN: D-Separation

Head-to-tail (a chain a -> c -> b):

P(a,b,c) = P(a) P(c|a) P(b|c)

Given c:

P(a,b|c) = P(a,b,c) / P(c) = P(a|c) P(b|c), so a and b are conditionally independent given c

Not given c:

P(a,b) = Σc P(a) P(c|a) P(b|c) = P(a) P(b|a), which in general ≠ P(a) P(b)

Reasoning BN: D-Separation

Tail-to-tail (a common cause a <- c -> b):

P(a,b,c) = P(c) P(a|c) P(b|c)

Given c:

P(a,b|c) = P(a,b,c) / P(c) = P(a|c) P(b|c), so a and b are conditionally independent given c

Not given c:

P(a,b) = Σc P(c) P(a|c) P(b|c), which in general ≠ P(a) P(b)

Reasoning BN: D-Separation

Head-to-head (a common effect a -> c <- b):

P(a,b,c) = P(c|a,b) P(a) P(b)

Given c:

P(a,b|c) = P(a,b,c) / P(c), which in general ≠ P(a|c) P(b|c)

Not given c:

P(a,b) = Σc P(c|a,b) P(a) P(b) = P(a) P(b), so a and b are marginally independent
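A numeric check of the head-to-head ("explaining away") case with made-up CPTs, verifying that a and b are independent marginally but become dependent once c is observed:

```python
import itertools

# Head-to-head structure a -> c <- b with illustrative CPTs.
Pa, Pb = 0.3, 0.6                                       # P(a=1), P(b=1)
Pc = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}  # P(c=1 | a, b)

def joint(a, b, c):
    pc1 = Pc[(a, b)]
    return ((Pa if a else 1 - Pa) * (Pb if b else 1 - Pb)
            * (pc1 if c else 1 - pc1))

# Marginally, P(a,b) factors as P(a)P(b): the two columns match.
for a, b in itertools.product([0, 1], repeat=2):
    pab = sum(joint(a, b, c) for c in [0, 1])
    print(a, b, round(pab, 4),
          round((Pa if a else 1 - Pa) * (Pb if b else 1 - Pb), 4))

# Conditioned on c=1, P(a,b|c) does NOT factor as P(a|c)P(b|c).
pc = sum(joint(a, b, 1) for a in [0, 1] for b in [0, 1])
pa_c = sum(joint(1, b, 1) for b in [0, 1]) / pc
pb_c = sum(joint(a, 1, 1) for a in [0, 1]) / pc
print(round(joint(1, 1, 1) / pc, 4), "vs", round(pa_c * pb_c, 4))
```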

Suppose we have three sets of random variables: X, Y and Z.

X and Y are D-separated by Z (and therefore conditionally independent given Z) iff every path from any variable in X to any variable in Y is blocked.

A path from variable A to variable B is blocked if it includes a node such that either

-- the arrows on the path meet head-to-tail or tail-to-tail at the node, and this node is in Z, or

-- the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in Z.

X and Y are conditionally independent given Z if and only if X and Y are D-separated by Z. (A mechanical check follows.)
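These rules can be checked mechanically. A minimal sketch assuming networkx ≥ 3.3, where the check is called is_d_separator (older releases expose the same check as d_separated); the diamond graph is one structure consistent with the X1...X4 example above:

```python
import networkx as nx

# Diamond graph: X1 -> X2, X1 -> X3, X2 -> X4, X3 -> X4.
G = nx.DiGraph([("X1", "X2"), ("X1", "X3"), ("X2", "X4"), ("X3", "X4")])

# X1 and X4 are d-separated given {X2, X3} ...
print(nx.is_d_separator(G, {"X1"}, {"X4"}, {"X2", "X3"}))  # True
# ... but not given X3 alone (the path X1 -> X2 -> X4 stays open).
print(nx.is_d_separator(G, {"X1"}, {"X4"}, {"X3"}))        # False
```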

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Inference In Bayesian Network

In general, inference is intractable (NP-complete).

For certain cases, it is tractable:

• Assigning probability to a fully observed set of variables

• Or if just one variable is unobserved

• Or for singly connected graphs (i.e., no undirected loops):

• Variable elimination

• Belief propagation

For multiply connected graphs:

• Junction tree

Sometimes we use a Monte Carlo method:

• Generate many samples according to the Bayes net distribution, then count up the results

Variational methods give tractable approximate solutions.

Inference In Bayesian Network

Probability of a joint assignment: easy

Suppose we are interested in the joint assignment <F=f, A=a, S=s, H=h, N=n>. What is P(f,a,s,h,n)?

Just read the relevant entry off each CPD and multiply: P(f,a,s,h,n) = Πi P(xi | Pa(xi)).

Inference In Bayesian Network

Probability of a marginal: not so easy

How do we calculate P(N = n)?

P(N = n) = Σf,a,s,h P(F=f, A=a, S=s, H=h, N=n)

= Σf,a,s,h Πi P(xi | Pa(xi))

i.e., sum the joint over all assignments to the unobserved variables; done naively, the cost is exponential in the number of hidden variables.
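Since the slides give no CPTs for F, A, S, H, N, here is the same brute-force marginalization sketched on the S, L, R, T, W network instead, reusing joint() and the illustrative CPTs defined earlier:

```python
import itertools

# Marginal by brute-force enumeration: sum the joint over all
# assignments to the unobserved variables (here S, L, R, T).
def marginal_W(w):
    return sum(joint(s, l, r, t, w)
               for s, l, r, t in itertools.product([0, 1], repeat=4))

print(marginal_W(1))  # P(W=1); cost grows exponentially in #hidden vars
```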

Inference In Bayesian Network

Generating a sample from the joint distribution: easy

How can we generate random samples according to P(F, A, S, H, N)? Sample the variables in topological order, each one given its already-sampled parents. For example, to randomly draw a value for F:

draw r ∈ [0, 1] uniformly

if r < P(F=1), then output f = 1, else f = 0

Note we can estimate a marginal like P(N=n) by generating many samples from the joint distribution, then summing up the probability mass for which N = n. Similarly for anything else we care about, e.g. P(F=1 | H=1, N=0). (A sampling sketch follows.)
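A sampling sketch, again on the S, L, R, T, W network whose structure the slides do give, reusing the illustrative CPTs from earlier:

```python
import random

# Forward (ancestral) sampling: sample each node after its parents.
def sample_once():
    s = int(random.random() < P_S)
    l = int(random.random() < P_L[s])
    r = int(random.random() < P_R[s])
    t = int(random.random() < P_T[l])
    w = int(random.random() < P_W[(l, r)])
    return s, l, r, t, w

samples = [sample_once() for _ in range(100_000)]

# Estimate a marginal by counting, e.g. P(W=1):
print(sum(w for *_, w in samples) / len(samples))

# Estimate a conditional by rejection, e.g. P(T=1 | W=1):
kept = [t for _, _, _, t, w in samples if w == 1]
print(sum(kept) / len(kept))
```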

Inference On a Chain

Converting directed to undirected graphs: a directed chain

p(x) = p(x1) p(x2|x1) ... p(xN|xN-1)

maps onto an undirected chain with pairwise potentials, e.g. ψ1,2(x1,x2) = p(x1) p(x2|x1) and ψn-1,n(xn-1,xn) = p(xn|xn-1), so that p(x) = (1/Z) Πn ψn-1,n(xn-1,xn).

Compute the marginals: the marginal of a single node is

p(xn) = (1/Z) μα(xn) μβ(xn)

where the messages μα and μβ are accumulated recursively from the two ends of the chain:

μα(xn) = Σxn-1 ψn-1,n(xn-1,xn) μα(xn-1)

μβ(xn) = Σxn+1 ψn,n+1(xn,xn+1) μβ(xn+1)

Passing messages once in each direction yields all N marginals in time linear in N, instead of the exponential cost of naive summation. (A message-passing sketch follows.)
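A minimal sketch of this forward-backward recursion; the potentials below are made up, and K is the number of states per node:

```python
import numpy as np

# Message passing on a chain MRF. potentials[n] is a (K, K) matrix
# for psi_{n,n+1}(x_n, x_{n+1}).
def chain_marginals(potentials, K):
    N = len(potentials) + 1                  # number of nodes
    alpha = [np.ones(K) for _ in range(N)]   # forward messages
    beta = [np.ones(K) for _ in range(N)]    # backward messages
    for n in range(1, N):                    # sweep left to right
        alpha[n] = potentials[n - 1].T @ alpha[n - 1]
    for n in range(N - 2, -1, -1):           # sweep right to left
        beta[n] = potentials[n] @ beta[n + 1]
    marginals = []
    for n in range(N):
        m = alpha[n] * beta[n]
        marginals.append(m / m.sum())        # normalizing divides by Z
    return marginals

psi = [np.array([[0.9, 0.1], [0.2, 0.8]]) for _ in range(4)]  # 5-node chain
for p in chain_marginals(psi, K=2):
    print(p)
```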

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Belief Propagation

Belief propagation algorithms have been developed separately for Bayesian networks, MRFs, and factor graphs; the versions based on the different models are mathematically equivalent. For brevity of exposition, we present the standard belief propagation algorithm for MRFs here.

Taking a Markov random field as the example: given some observed states, infer the corresponding hidden states.

Belief Propagation

Introduce the variable m_ij(x_j) to denote the message passed from hidden node i to hidden node j; it expresses what state node i believes node j should be in. m_ij has the same dimension as x_j, and each component of m_ij gives how likely node i thinks it is that node j is in the corresponding state.

Belief Propagation

Belief: the approximate marginal probability we compute is called a belief, and we write the belief of node i as b(x_i). The belief of node i is

b(x_i) = k φi(x_i, y_i) Πj∈N(i) m_ji(x_i)    (1)

where k is a normalization factor ensuring that the beliefs sum to 1, N(i) denotes the set of all neighbors of node i, and φi(x_i, y_i) is the local evidence linking hidden state x_i to its observation y_i. Messages are updated by equation (2), which keeps the messages consistent:

m_ij(x_j) = Σx_i φi(x_i, y_i) ψij(x_i, x_j) Πk∈N(i)\j m_ki(x_i)    (2)

In equation (2), all messages passed to node i, except the one passed from node j, are multiplied together; the summation accumulates over all possible states of node i.

Belief Propagation

In practical computation, beliefs are computed starting from the nodes at the edge of the graph, and a message is updated only once all of the messages it requires are known. Using equations (1) and (2), the belief of each node is computed in turn.

For acyclic MRFs, each message typically needs to be computed only once, which greatly improves efficiency. (A small sketch follows.)
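A minimal sketch of equations (1) and (2) on a small acyclic pairwise MRF, a 4-node star; all potentials below are made up. Here phi plays the role of φi(x_i, y_i) and psi of ψij:

```python
import numpy as np

# Sum-product belief propagation on an acyclic pairwise MRF.
K = 2
edges = [(0, 1), (0, 2), (0, 3)]
nbrs = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
phi = {i: np.array([0.6, 0.4]) for i in range(4)}        # local evidence
psi = {e: np.array([[0.8, 0.2], [0.2, 0.8]]) for e in edges}
psi.update({(j, i): m.T for (i, j), m in list(psi.items())})

msgs = {(i, j): np.ones(K) for i in nbrs for j in nbrs[i]}
# On a tree a few sweeps suffice; each message stabilizes once all
# the messages it depends on are known.
for _ in range(5):
    for i in nbrs:
        for j in nbrs[i]:
            prod = phi[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod = prod * msgs[(k, i)]
            m = psi[(i, j)].T @ prod        # sum over x_i, eq. (2)
            msgs[(i, j)] = m / m.sum()

for i in nbrs:
    b = phi[i].copy()
    for j in nbrs[i]:
        b = b * msgs[(j, i)]
    print(i, b / b.sum())                   # belief, eq. (1)
```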

Key Points Today

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network


Learning in Bayesian Network

Learning CPTs from Fully Observed Data

The MLE estimate of a CPT entry θx|pa from fully observed data is a normalized count:

θ̂x|pa = count(X = x, Pa(X) = pa) / count(Pa(X) = pa)
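A minimal sketch of this counting estimate; the toy data below is made up:

```python
from collections import Counter

# MLE of a CPT P(X | Pa(X)) from fully observed data: normalized counts.
# `data` is a list of (x, parent_assignment) pairs.
data = [(1, (0,)), (0, (0,)), (1, (1,)), (1, (1,)), (0, (1,)), (1, (1,))]

joint_counts = Counter(data)                    # count(X=x, Pa=pa)
parent_counts = Counter(pa for _, pa in data)   # count(Pa=pa)

cpt = {(x, pa): joint_counts[(x, pa)] / parent_counts[pa]
       for (x, pa) in joint_counts}
print(cpt)   # e.g. P(X=1 | Pa=(1,)) = 3/4
```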

Learning in Bayesian Network

When some variables are unobserved, the counts above are unavailable and the MLE has no closed form; the standard tool is the EM algorithm.

EM Algorithm

• E step: with the current parameters fixed, compute expected counts for each (node, parents) configuration, filling in the hidden variables with their posterior probabilities.

• M step: re-estimate the CPTs from these expected counts, exactly as in the fully observed case.

Iterating the two steps monotonically increases the likelihood and converges to a local optimum. (A toy sketch follows.)
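A toy sketch of the two steps for the smallest interesting case: a hidden binary parent Z with one observed binary child X. All starting values and data are invented; this illustrates the mechanics, not a useful model:

```python
import numpy as np

# EM for the tiny network Z -> X with Z hidden and X observed.
X = np.array([1, 1, 0, 1, 0, 0, 1, 1])     # observed data
pz = 0.5                                   # P(Z=1), initial guess
px_z = np.array([0.3, 0.6])                # P(X=1 | Z=z), initial guess

for _ in range(50):
    # E step: posterior responsibility r_n = P(Z=1 | X=x_n).
    lik1 = pz * np.where(X == 1, px_z[1], 1 - px_z[1])
    lik0 = (1 - pz) * np.where(X == 1, px_z[0], 1 - px_z[0])
    r = lik1 / (lik1 + lik0)
    # M step: re-estimate parameters from expected counts.
    pz = r.mean()
    px_z = np.array([((1 - r) * X).sum() / (1 - r).sum(),
                     (r * X).sum() / r.sum()])

print(pz, px_z)
```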

Using Unlabeled Data to Help Train Naïve Bayes Classifier

Treat the missing class label as a hidden variable and apply EM: labeled examples contribute observed counts, while unlabeled examples contribute expected counts, weighted by the current classifier's posterior over the classes.

Summary

-- Intro to Graphical Model
-- Conditional Independence
-- Intro to Bayesian Network
-- Reasoning BN: D-Separation
-- Inference In Bayesian Network
-- Belief Propagation
-- Learning in Bayesian Network

Thanks
