![Page 1: Bayes Nets - University of Minnesotavision.psych.umn.edu/users/schrater/schrater_lab/courses/...• Graphical models are a marriage between probability theory and graph theory. They](https://reader033.vdocuments.mx/reader033/viewer/2022041921/5e6c258fd0150667ea71b9c9/html5/thumbnails/1.jpg)
Bayes Nets
• Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity -- and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity -- a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data. The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly-interacting sets of variables as well as a data structure that lends itself naturally to the design of efficient general-purpose algorithms.
• Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition and statistical mechanics are special cases of the general graphical model formalism -- examples include mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models. The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages -- in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems.
-- Michael Jordan
Graphical Models
• Representation:
  – Efficiently represent a joint probability distribution
• Inference:
  – Infer hidden states of the system, given data
• Learning:
  – Estimate the parameters and structure of the model from data
• Applications
Example 1
• Pearl's (1988, p. 49) example: Sherlock Holmes is at work.
• His neighbor Mr. Watson, a practical joker, has called to say that the alarm at Holmes's home has sounded.
• Should Sherlock rush home?
• If the alarm really has sounded, it may be because of a burglary or because of an earthquake. If he hears a radio report of an earthquake, his degree of confidence that there was a burglary will diminish.
What is a Bayes (belief) net?
Qualitative part: directed acyclic graph (DAG)
• Nodes - random vars.
• Edges - direct influence
Quantitative part: set of conditional probability distributions
[Figure: network with edges Earthquake → Radio, Earthquake → Alarm, Burglary → Alarm and Alarm → Call. The family of Alarm carries a table P(A | E,B) with entries (0.9, 0.1), (0.2, 0.8), (0.01, 0.99) and (0.9, 0.1) across the settings of (E, B). Figure from N. Friedman]
Compact representation of joint probability distributions via conditional independence
Together: define a unique distribution in a factored form
P(B, E, A, C, R) = P(B) P(E) P(A | B, E) P(R | E) P(C | A)
What is a Bayes net?
[Figure: the network over Earthquake, Radio, Burglary, Alarm and Call]
A node is conditionally independent of its non-descendants (in particular, its ancestors) given its parents, e.g. C ⊥ R, B, E | A
Hence the number of parameters drops from 2^5 - 1 = 31 to 1+1+2+4+2 = 10
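The parameter count can be checked mechanically. The sketch below assumes every variable is binary and reads the parent sets off the factorization P(B) P(E) P(A | B, E) P(R | E) P(C | A); the function names are illustrative only.

```python
# Parents of each node, read off the factorization
# P(B,E,A,C,R) = P(B) P(E) P(A|B,E) P(R|E) P(C|A).
parents = {"B": [], "E": [], "A": ["B", "E"], "R": ["E"], "C": ["A"]}

def full_joint_params(n_vars):
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n_vars - 1

def factored_params(parents):
    """One free parameter per parent configuration of each binary node."""
    return sum(2 ** len(ps) for ps in parents.values())

print(full_joint_params(len(parents)))  # 31
print(factored_params(parents))         # 10
```

Each binary node with k parents contributes 2^k numbers, which is where the terms 1, 1, 4, 2 and 2 come from.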
Why are Bayes nets useful?
- Graph structure supports
  - Modular representation of knowledge
  - Local, distributed algorithms for inference and learning
  - Intuitive (possibly causal) interpretation
- Factored representation may have exponentially fewer parameters than full joint P(X1,…,Xn) =>
  - lower sample complexity (less data for learning)
  - lower time complexity (less time for inference)
What can Bayes nets be used for?
• Posterior probabilities
  – Probability of any event given any evidence
• Most likely explanation
  – Scenario that explains evidence
• Rational decision making
  – Maximize expected utility
  – Value of Information
• Effect of intervention
  – Causal analysis
[Figure: the Earthquake, Radio, Burglary, Alarm, Call network, with Radio and Call singled out, illustrating the explaining away effect. Figure from N. Friedman]
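Explaining away can be reproduced by brute-force enumeration over the five-variable alarm network. This is a minimal sketch: all CPT numbers below are hypothetical, chosen only for illustration, and the helper names are not from the slides.

```python
from itertools import product

# Hypothetical CPTs (illustrative numbers, not taken from the slides).
p_b = 0.01                      # P(B = 1)
p_e = 0.02                      # P(E = 1)
p_a = {(1, 1): 0.95, (1, 0): 0.9, (0, 1): 0.2, (0, 0): 0.01}  # P(A=1 | B, E)
p_r = {1: 0.9, 0: 0.001}        # P(R=1 | E)
p_c = {1: 0.8, 0: 0.05}         # P(C=1 | A)

def bern(p, x):
    """Probability that a binary variable with P(1) = p takes value x."""
    return p if x == 1 else 1.0 - p

def joint(b, e, a, c, r):
    """P(B,E,A,C,R) = P(B) P(E) P(A|B,E) P(R|E) P(C|A)."""
    return (bern(p_b, b) * bern(p_e, e) * bern(p_a[(b, e)], a)
            * bern(p_r[e], r) * bern(p_c[a], c))

def posterior_burglary(**evidence):
    """P(B = 1 | evidence), summing the joint over all consistent worlds."""
    names = ["b", "e", "a", "c", "r"]
    num = den = 0.0
    for values in product([0, 1], repeat=5):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        den += p
        num += p * world["b"]
    return num / den

print(posterior_burglary(c=1))        # belief in burglary after the call
print(posterior_burglary(c=1, r=1))   # lower once the radio reports an earthquake
```

Enumeration costs 2^n joint evaluations, so it only serves as a reference for small networks.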
Example 2
Wet example cont’d
A real Bayes net: Alarm
Domain: Monitoring Intensive-Care Patients
• 37 variables
• 509 parameters … instead of 2^37
[Figure: the ALARM monitoring network, a DAG over variables such as HYPOVOLEMIA, LVFAILURE, ANAPHYLAXIS, PULMEMBOLUS, INTUBATION, KINKEDTUBE, DISCONNECT, CATECHOL, HRBP, HREKG, HRSAT, SAO2, EXPCO2, PAP, CVP and BP. Figure from N. Friedman]
More real-world BN applications
• “Microsoft’s competitive advantage lies in its expertise in Bayesian networks” -- Bill Gates, quoted in LA Times, 1996
• MS Answer Wizards, (printer) troubleshooters
• Medical diagnosis
• Genetic pedigree analysis
• Speech recognition (HMMs)
• Gene sequence/expression analysis
• Turbocodes (channel coding)
Dealing with time
• In many systems, data arrives sequentially
• Dynamic Bayes nets (DBNs) can be used to model such time-series (sequence) data
• Special cases of DBNs include
  – State-space models
  – Hidden Markov models (HMMs)
State-space model (SSM)/Linear Dynamical System (LDS)
[Figure: chain X1 → X2 → X3 of “true” states, each Xt emitting a noisy observation Yt]
Example: LDS for 2D tracking
[Figure: the chain X1 → X2 → X3 with observations Y1, Y2, Y3, alongside panels labeled X1, X2, y1 and y2 with observations marked “o”]
Sparse linear Gaussian systems ⇒ sparse graphs
Hidden Markov model (HMM)
[Figure: chain X1 → X2 → X3 of hidden states (phones/words) linked by a transition matrix, each Xt emitting an observation Yt (acoustic signal) through Gaussian observations]
Sparse transition matrix ⇒ sparse graph
Probabilistic graphical models
Graphical models, a subset of probabilistic models, split into:
• Directed (Bayesian belief nets): Alarm network, State-space models, HMMs, Naïve Bayes classifier, PCA/ICA
• Undirected (Markov nets): Markov Random Field, Boltzmann machine, Ising model, Max-ent model, Log-linear models
Many Pattern Recognition Methods are instances of graphical models
Inference
• Posterior probabilities
  – Probability of any event given any evidence
• Most likely explanation
  – Scenario that explains evidence
• Rational decision making
  – Maximize expected utility
  – Value of Information
• Effect of intervention
  – Causal analysis
[Figure: the Earthquake, Radio, Burglary, Alarm, Call network, with Radio and Call singled out, illustrating the explaining away effect. Figure from N. Friedman]
Wet inference
• Q: Is grass wet due to sprinkler or rain?
Efficient Evaluation
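Kalman filtering as a Belief Network
[Figure: chain X1 → X2 → X3 with observations Y1, Y2, Y3]
Estimate $P(X_t \mid y_{1:t})$ from $P(X_{t-1} \mid y_{1:t-1})$ and $y_t$
• Predict: $P(X_t \mid y_{1:t-1}) = \sum_{X_{t-1}} P(X_t \mid X_{t-1})\, P(X_{t-1} \mid y_{1:t-1})$ (an integral when the state is continuous, as in the Kalman filter)
• Update: $P(X_t \mid y_{1:t}) \propto P(y_t \mid X_t)\, P(X_t \mid y_{1:t-1})$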
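For a linear Gaussian model the predict and update steps have closed forms. The following is a minimal sketch for a scalar state, assuming the hypothetical model x_t = a·x_{t-1} + noise of variance q and y_t = c·x_t + noise of variance r; the parameter names and default values are illustrative, not from the slides.

```python
def kalman_step(mean, var, y, a=1.0, q=0.1, c=1.0, r=0.5):
    """One predict/update cycle of a scalar Kalman filter.

    (mean, var) describe the Gaussian P(X_{t-1} | y_{1:t-1});
    the return value describes P(X_t | y_{1:t}).
    """
    # Predict: push the previous belief through the dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: reweight the prediction by the likelihood of y.
    gain = pred_var * c / (c * c * pred_var + r)
    new_mean = pred_mean + gain * (y - c * pred_mean)
    new_var = (1.0 - gain * c) * pred_var
    return new_mean, new_var

mean, var = 0.0, 1.0                 # prior belief over the initial state
for y in [1.1, 0.9, 1.2, 1.0]:       # made-up observations
    mean, var = kalman_step(mean, var, y)
print(mean, var)
```

The posterior variance does not depend on the observed values, only on the model parameters, which is a property specific to the linear Gaussian case.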
Belief Propagation
[Figure: trees with a marked root illustrating the two passes, Collect and Distribute. Figure from P. Green]
Generalization of forwards-backwards algo. / RTS smoother from chains to trees - linear time, two-pass algorithm
aka Pearl’s algorithm, sum-product algorithm
Message passing view of forwards algorithm
[Figure: chain Xt-1 → Xt → Xt+1 with observations Yt-1, Yt, Yt+1 and messages labeled a_{t|t-1}, b_t and b_{t+1}]
Forwards-backwards algorithm
[Figure: chain Xt-1 → Xt → Xt+1 with observations Yt-1, Yt, Yt+1 and messages labeled a_{t|t-1} and b_t]
Discrete analog of RTS smoother
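A minimal sketch of the forwards-backwards algorithm for a discrete HMM follows. The names `alpha`, `beta` and `gamma` follow a common textbook convention and are assumptions here, as are the two-state transition matrix and the observation likelihoods in the usage example.

```python
def forwards_backwards(prior, trans, obs_lik):
    """Filtered and smoothed state marginals for a discrete HMM.

    prior[i]      = P(X_1 = i)
    trans[i][j]   = P(X_t = j | X_{t-1} = i)
    obs_lik[t][i] = P(y_t | X_t = i)
    Returns (alpha, gamma) with alpha[t] = P(X_t | y_{1:t})
    and gamma[t] = P(X_t | y_{1:T}).
    """
    T, K = len(obs_lik), len(prior)

    def normalize(v):
        s = sum(v)
        return [x / s for x in v]

    # Forwards pass: alternate update (multiply by likelihood) and predict.
    alpha, pred = [], prior
    for t in range(T):
        alpha.append(normalize([obs_lik[t][j] * pred[j] for j in range(K)]))
        pred = [sum(alpha[t][i] * trans[i][j] for i in range(K))
                for j in range(K)]

    # Backwards pass: beta[t][i] is proportional to P(y_{t+1:T} | X_t = i).
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalize([
            sum(trans[i][j] * obs_lik[t + 1][j] * beta[t + 1][j]
                for j in range(K))
            for i in range(K)
        ])

    # Combine the two messages at every time step.
    gamma = [normalize([alpha[t][i] * beta[t][i] for i in range(K)])
             for t in range(T)]
    return alpha, gamma

# Hypothetical two-state example.
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
obs_lik = [[0.8, 0.2], [0.8, 0.2], [0.3, 0.7]]
alpha, gamma = forwards_backwards(prior, trans, obs_lik)
print(alpha[0], gamma[0])
```

Both passes cost O(T·K²), and at the last time step the smoothed and filtered marginals coincide because no future evidence remains.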
BP: parallel, distributed version
[Figure: a graph on nodes X1, X2, X3, X4 shown at Stage 1 and Stage 2 of message passing]
Inference in general graphs
• BP is only guaranteed to be correct for trees
• A general graph should be converted to a junction tree, by clustering nodes
• Computational complexity is exponential in the size of the resulting clusters (NP-hard)
Approximate inference
• Why?
  – To avoid the exponential complexity of exact inference in discrete loopy graphs
  – Because messages cannot be computed in closed form (even for trees) in the non-linear/non-Gaussian case
• How?
  – Deterministic approximations: loopy BP, mean field, structured variational, etc.
  – Stochastic approximations: MCMC (Gibbs sampling), likelihood weighting, particle filtering, etc.
- Algorithms make different speed/accuracy tradeoffs
- Should provide the user with a choice of algorithms
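As one concrete instance of a stochastic approximation, the sketch below applies likelihood weighting to a three-node fragment B → A ← E with evidence A = 1. The CPT numbers are hypothetical and the function names are illustrative; an exact enumeration is included only to show the size of the sampling error.

```python
import random

# Hypothetical parameters (not from the slides).
p_b, p_e = 0.3, 0.2                                            # P(B=1), P(E=1)
p_a = {(1, 1): 0.95, (1, 0): 0.9, (0, 1): 0.3, (0, 0): 0.05}   # P(A=1 | B, E)

def likelihood_weighting(n_samples, seed=0):
    """Estimate P(B=1 | A=1): sample the unobserved nodes from their priors
    and weight each sample by the likelihood of the evidence A=1."""
    rng = random.Random(seed)
    weighted_b = total = 0.0
    for _ in range(n_samples):
        b = int(rng.random() < p_b)
        e = int(rng.random() < p_e)
        w = p_a[(b, e)]              # weight = P(A=1 | b, e)
        weighted_b += w * b
        total += w
    return weighted_b / total

def exact():
    """Reference value of P(B=1 | A=1) by enumeration."""
    num = den = 0.0
    for b in (0, 1):
        for e in (0, 1):
            p = (p_b if b else 1 - p_b) * (p_e if e else 1 - p_e) * p_a[(b, e)]
            den += p
            num += p * b
    return num / den

print(exact(), likelihood_weighting(20000))
```

Raising `n_samples` trades time for accuracy, which is the kind of speed/accuracy choice the bullets above refer to.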
Learning
• Parameter estimation
• Model selection (structure learning)
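Parameter learning
[Figure: a table of iid data is used to estimate the Conditional Probability Tables (CPTs). Figure from M. Jordan]

| X1 | X2 | X3 | X4 | X5 | X6 |
|----|----|----|----|----|----|
| 0  | 1  | 0  | 0  | 0  | 0  |
| 1  | 1  | 1  | 0  | 1  | 1  |
| …  | …  | …  | …  | …  | …  |
| 1  | ?  | 1  | 1  | ?  | 1  |

If some values are missing (latent variables), we must use some method (e.g. gradient descent or EM) to compute the (locally) maximum likelihood estimates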
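When every row is fully observed, the maximum likelihood CPT entries are just normalized counts. The sketch below assumes binary variables and complete data; the function name `fit_cpt` and the toy rows are hypothetical. Rows containing "?" would need EM or a gradient method instead.

```python
from collections import Counter

def fit_cpt(rows, child, parents):
    """Maximum likelihood estimate of P(child = 1 | parents) by counting.

    rows is a list of dicts mapping variable names to 0/1 values.
    Only parent configurations that occur in the data get an entry.
    """
    ones = Counter()      # times child == 1 for each parent configuration
    totals = Counter()    # times each parent configuration occurs
    for row in rows:
        key = tuple(row[p] for p in parents)
        totals[key] += 1
        ones[key] += row[child]
    return {key: ones[key] / n for key, n in totals.items()}

# Made-up complete data for the family of Alarm.
rows = [
    {"B": 0, "E": 0, "A": 0},
    {"B": 1, "E": 0, "A": 1},
    {"B": 1, "E": 0, "A": 1},
    {"B": 1, "E": 0, "A": 0},
    {"B": 0, "E": 1, "A": 0},
]
cpt = fit_cpt(rows, child="A", parents=["B", "E"])
print(cpt[(1, 0)])   # fraction of B=1, E=0 rows in which A=1
```

Because the likelihood factorizes over families, each CPT can be fitted independently of the others.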
Structure learning (which nodes are connected)
[Figure: gene expression data mapped to a learned genetic pathway. Figure from N. Friedman]
Structure learning
• Learning the optimal structure is NP-hard (except for trees)
• Hence use heuristic search through space of DAGs or PDAGs or node orderings
• Search algorithms: hill climbing, simulated annealing, GAs
• Scoring function is often marginal likelihood, or an approximation like BIC/MDL or AIC, which adds a structural complexity penalty
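To make the scoring idea concrete, here is a minimal sketch of a BIC score for candidate structures over binary variables, using the standard form (maximized log-likelihood) minus (d/2)·log N, where d is the number of free parameters. The function name `bic_score` and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def bic_score(rows, parents):
    """BIC = max log-likelihood - (d / 2) * log N for binary variables.

    parents maps each variable to the list of its parents in the candidate DAG.
    """
    n = len(rows)
    loglik, n_params = 0.0, 0
    for child, ps in parents.items():
        n_params += 2 ** len(ps)          # one free parameter per parent config
        family = Counter((tuple(r[p] for p in ps), r[child]) for r in rows)
        config = Counter(tuple(r[p] for p in ps) for r in rows)
        for (key, _), count in family.items():
            # Each observed (config, value) pair contributes count * log(MLE).
            loglik += count * math.log(count / config[key])
    return loglik - 0.5 * n_params * math.log(n)

# Toy data in which Y always copies X.
rows = [{"X": x, "Y": x} for x in (0, 1)] * 20
dependent = bic_score(rows, {"X": [], "Y": ["X"]})
independent = bic_score(rows, {"X": [], "Y": []})
print(dependent, independent)   # the structure with the edge X -> Y scores higher
```

A search procedure such as hill climbing would call a score like this on each neighboring DAG and keep the best one.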