Lecture 10 / Chapter 10: Introduction to Probabilistic Graphical Models
Weike Pan, and Congfu Xu{panweike, xucongfu}@zju.edu.cn
Institute of Artificial Intelligence
College of Computer Science, Zhejiang University
October 12, 2006
Courseware for Introduction to Artificial Intelligence, College of Computer Science, Zhejiang University
References
An Introduction to Probabilistic Graphical Models. Michael I. Jordan.
http://www.cs.berkeley.edu/~jordan/graphical.html
Outline
Preparations
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
Outline
Preparations
  PGM "is" a universal model
  Different thoughts of machine learning
  Different training approaches
  Different data types
  Bayesian Framework
  Chain rules of probability theory
  Conditional Independence
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
Different thoughts of machine learning
Statistics (modeling uncertainty, detailed information) vs. logic (modeling complexity, high-level information).
Unifying Logical and Statistical AI. Pedro Domingos, University of Washington. AAAI 2006.
Speech example: statistical information (acoustic model + language model + affect model ...) combined with high-level information (expert knowledge / logic).
Different training approaches
Maximum likelihood training, including MAP (maximum a posteriori),
vs. discriminative training, e.g. maximum margin (SVM).
Speech: the classical combination of maximum likelihood + discriminative training.
Different data types
Directed acyclic graphs (Bayesian networks, BN) model asymmetric effects and dependencies: causal/temporal dependence (e.g. speech analysis, DNA sequence analysis).
Undirected graphs (Markov random fields, MRF) model symmetric effects and dependencies: spatial dependence (e.g. image analysis).
PGM “is” a universal model
To model both temporal and spatial data, PGM unifies
  thoughts: statistics + logic;
  approaches: maximum likelihood training + discriminative training.
Furthermore, the directed and undirected models together provide modeling power beyond that which could be provided by either alone.
Bayesian Framework
$$P(c_i \mid O) = \frac{P(O \mid c_i)\, P(c_i)}{P(O)}$$

where $P(c_i \mid O)$ is the a posteriori probability, $P(O \mid c_i)$ the likelihood, $P(c_i)$ the a priori probability of class $c_i$, $P(O)$ the normalization factor, and $O$ the observation.

What we care about is the conditional probability, which is the ratio of a joint probability to a marginal probability.
Problem description: from an observation to a conclusion (classification or prediction), via the Bayes rule.
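As a minimal worked example with made-up numbers (not from the slides): suppose two classes with priors $P(c_1) = 0.3$, $P(c_2) = 0.7$ and likelihoods $P(O \mid c_1) = 0.8$, $P(O \mid c_2) = 0.1$. Then

$$P(O) = 0.8 \cdot 0.3 + 0.1 \cdot 0.7 = 0.31, \qquad P(c_1 \mid O) = \frac{0.8 \cdot 0.3}{0.31} \approx 0.774$$

so the observation shifts belief in $c_1$ from the prior 0.3 to a posterior of about 0.77.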
Chain rules of probability theory
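For completeness, the chain rule the slide title refers to: any joint distribution factors, for any variable ordering and with no independence assumptions, as

$$p(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_1, \dots, x_{i-1}).$$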
Conditional Independence
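For completeness, the standard definition: $X$ is conditionally independent of $Z$ given $Y$, written $X \perp Z \mid Y$, iff

$$p(x, z \mid y) = p(x \mid y)\, p(z \mid y), \quad \text{equivalently} \quad p(x \mid y, z) = p(x \mid y).$$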
Outline
Preparations
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
PGM
Nodes represent random variables/states; the missing arcs represent conditional independence assumptions.
The graph structure implies a decomposition of the joint probability distribution, as illustrated below.
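As a small illustration (not from the slides), for the chain $X_1 \to X_2 \to X_3$ the missing arc from $X_1$ to $X_3$ asserts $X_3 \perp X_1 \mid X_2$, and the structure implies the decomposition

$$p(x_1, x_2, x_3) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2).$$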
Directed PGM (BN)
Representation
Conditional Independence
Probability Distribution
Queries
Implementation
Interpretation
Probability Distribution
Definition of the joint probability distribution:
$$p(x) \triangleq \prod_i f_i(x_i, x_{\pi_i})$$
where $x_{\pi_i}$ denotes the parents of node $x_i$.
Check that each local function is a valid conditional probability:
$$\sum_{x_i} f_i(x_i, x_{\pi_i}) = 1, \qquad f_i(x_i, x_{\pi_i}) \ge 0$$
so that $f_i(x_i, x_{\pi_i}) = p(x_i \mid x_{\pi_i})$.
Representation
Graphical models represent joint probability distributions more economically, using a set of “local” relationships among variables.
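A minimal sketch of this "economy" claim, assuming a hypothetical chain of n binary variables (the functions and numbers are illustrative, not from the slides): the factored model needs far fewer parameters than the full joint table.

```python
# A minimal sketch (not from the slides): parameter counts for n binary
# variables, comparing a full joint table with a chain-factored model.

def full_joint_params(n):
    # A full joint over n binary variables needs 2^n - 1 free numbers.
    return 2 ** n - 1

def chain_params(n):
    # Chain X1 -> X2 -> ... -> Xn: p(x1) needs 1 number,
    # each p(x_i | x_{i-1}) needs 2 (one per parent value).
    return 1 + 2 * (n - 1)

for n in (3, 10, 20):
    print(n, full_joint_params(n), chain_params(n))
# e.g. n=20: 1048575 vs. 39 -- "local" relationships are far cheaper.
```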
Conditional Independence (basic)
Assert the conditional independence of a node from its ancestors, conditional on its parents.
Interpret missing edges in terms of conditional independence
Conditional Independence (3 canonical graphs)

Classical Markov chain (head-to-tail): "past", "present", "future"; the present separates the past from the future.
Common cause (tail-to-tail): Y "explains" all the dependencies between X and Z.
Common effect (head-to-head): multiple, competing explanations; X and Z are marginally independent but become dependent once Y is observed ("explaining away", e.g. two independent causes competing to explain a common alarm).

For the common-effect graph, marginal independence follows directly:
$$p(x, y, z) = p(x)\, p(z)\, p(y \mid x, z)$$
$$p(x, z) = \sum_y p(x, y, z) = p(x)\, p(z)$$
Conditional Independence
Conditional Independence (check)
The three local configurations to check at an intermediate node: one incoming arrow and one outgoing arrow (head-to-tail), two outgoing arrows (tail-to-tail), and two incoming arrows (head-to-head).
Check conditional independence through reachability: the Bayes ball algorithm (rules); see the sketch below.
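A sketch of the reachability check, following the standard Bayes ball / d-separation reachable-nodes algorithm; the graph encoding (a dict of parent lists) and the tiny common-effect example are illustrative assumptions, not from the slides.

```python
# A minimal sketch (not from the slides) of the Bayes ball / d-separation
# "reachability" check on a DAG.
from collections import deque

def reachable(parents, x, z):
    """Nodes reachable from x via active trails given evidence set z.
    `parents` maps each node to the list of its parents."""
    children = {n: [] for n in parents}
    for n, ps in parents.items():
        for p in ps:
            children[p].append(n)
    # a = evidence nodes plus all their ancestors (needed for colliders).
    a, stack = set(z), list(z)
    while stack:
        for p in parents[stack.pop()]:
            if p not in a:
                a.add(p); stack.append(p)
    visited, result = set(), set()
    queue = deque([(x, "up")])          # start as if arriving from a child
    while queue:
        y, d = queue.popleft()
        if (y, d) in visited:
            continue
        visited.add((y, d))
        if y not in z:
            result.add(y)
        if d == "up" and y not in z:    # arrived from a child
            for p in parents[y]:
                queue.append((p, "up"))
            for c in children[y]:
                queue.append((c, "down"))
        elif d == "down":               # arrived from a parent
            if y not in z:
                for c in children[y]:
                    queue.append((c, "down"))
            if y in a:                  # head-to-head node is unblocked
                for p in parents[y]:
                    queue.append((p, "up"))
    return result

# Common-effect example X -> Y <- Z: X and Z are marginally independent,
# but conditioning on Y makes Z reachable from X ("explaining away").
g = {"X": [], "Z": [], "Y": ["X", "Z"]}
print("Z" in reachable(g, "X", set()))   # False
print("Z" in reachable(g, "X", {"Y"}))   # True
```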
Outline
Preparations
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
Undirected PGM (MRF)
Representation
Conditional Independence
Probability Distribution
Queries
Implementation
Interpretation
Probability Distribution(1)
Clique: a clique of a graph is a fully connected subset of nodes. Local functions should not be defined on domains of nodes that extend beyond the boundaries of cliques.
Maximal cliques: the maximal cliques of a graph are the cliques that cannot be extended to include additional nodes without losing the property of being fully connected.
We restrict ourselves to maximal cliques without loss of generality, as they capture all possible dependencies.
Potential function (local parameterization): $\psi_{X_C}(x_C)$, a potential function on the possible realizations $x_C$ of the maximal clique $X_C$.
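As an illustrative, made-up example: for a maximal clique $C = \{X_1, X_2\}$ over binary variables, any nonnegative table is a valid potential, e.g.

$$\psi_C(x_1, x_2) = \begin{pmatrix} 1.5 & 0.2 \\ 0.2 & 1.5 \end{pmatrix},$$

which favors configurations where $x_1 = x_2$. Unlike a conditional probability table, a potential need not sum to one; that is what the normalization factor $Z$ introduced below is for.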
Probability Distribution(2)
Maximal cliques
Probability Distribution(3)
Joint probability distribution:
$$p(x) = \frac{1}{Z} \prod_C \psi_{X_C}(x_C)$$
Normalization factor (partition function):
$$Z = \sum_x \prod_C \psi_{X_C}(x_C)$$
Writing each potential in exponential form, $\psi_{X_C}(x_C) = \exp\{-H_C(x_C)\}$, gives
$$p(x) = \frac{1}{Z} \prod_C \exp\{-H_C(x_C)\} = \frac{1}{Z} \exp\Big\{-\sum_C H_C(x_C)\Big\} = \frac{1}{Z} \exp\{-H(x)\}$$
$$Z = \sum_x \exp\{-H(x)\}$$
This is the Boltzmann distribution.
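A minimal sketch, with a made-up clique energy function, of computing $Z$ and the Boltzmann joint by brute force for a tiny three-node chain MRF over binary variables:

```python
# A minimal sketch (not from the slides): brute-force computation of the
# normalization factor Z and the joint p(x) for a 3-node chain MRF.
import itertools, math

def H(a, b):
    # A made-up clique energy: low energy (high potential) when a == b.
    return 0.0 if a == b else 1.0

def unnorm(x):
    # Maximal cliques of the chain X1 - X2 - X3 are {X1,X2} and {X2,X3}.
    return math.exp(-H(x[0], x[1])) * math.exp(-H(x[1], x[2]))

Z = sum(unnorm(x) for x in itertools.product([0, 1], repeat=3))
p = {x: unnorm(x) / Z for x in itertools.product([0, 1], repeat=3)}
print(Z, p[(0, 0, 0)], p[(0, 1, 0)])  # agreeing configurations are likelier
```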
Conditional Independence
It’s a “reachability” problem in graph theory.
Representation
Outline
Preparations
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
Insights of PGM (Michael I. Jordan)
Probabilistic Graphical Models are a marriage between probability theory and graph theory.
A graphical model can be thought of as a probabilistic database, a machine that can answer “queries” regarding the values of sets of random variables.
We build up the database in pieces, using probability theory to ensure that the pieces have a consistent overall interpretation. Probability theory also justifies the inferential machinery that allows the pieces to be put together “on the fly” to answer the queries.
In principle, all “queries” of a probabilistic database can be answered if we have in hand the joint probability distribution.
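A minimal sketch of this "probabilistic database" idea, with a made-up joint over two binary variables: both marginal and conditional queries reduce to sums over the joint table.

```python
# A minimal sketch (not from the slides): with the full joint in hand,
# any query is a sum over the joint table, indexed here as joint[(a, b)].
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

# Marginal query p(A = 1): sum out B.
p_a1 = sum(p for (a, b), p in joint.items() if a == 1)

# Conditional query p(B = 1 | A = 1): a ratio of two such sums.
p_a1_b1 = joint[(1, 1)]
print(p_a1, p_a1_b1 / p_a1)   # 0.6 and 0.666...
```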
Insights of PGM (data structure & algorithm)
A graphical model is a natural/perfect tool for representation (the data structure) and inference (the algorithm).
Thanks!