introduction to bayesian networks. representation: what is the joint probability distribution p(b,...

22
Introduction to Bayesian Networks

Upload: andrew-montgomery

Post on 03-Jan-2016

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Introduction to Bayesian Networks

Page 2: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Representation: What is the joint probability distribution

P(B, E, R, A, N) over five binary variables?

• How many states in total?

• How many free parameters do we need?

• How many table entries should we sum over to get P(R=r)?

Joint Probability DistributionJoint Probability Distribution

Page 3: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Joint Probability DistributionJoint Probability Distribution

• Full joint distribution is sufficient to represent the complete domain and to do any type of probabilistic inferences

• Problems: n - number of random variables, d – number of values

– Space complexity. A full joint distribution requires to remember O(dn) numbers

– Inference complexity. Computing queries requires O(dn) steps

– Acquisition problem. Who is going to define all the probabilities?

Page 4: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Joint Probability DistributionJoint Probability Distribution

),,,,( NAREBP

Page 5: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Joint Probability DistributionJoint Probability Distribution

),,,|(),,|(),|()|()(),,,,( AREBNPREBAPEBRPBEPBPNAREBP

Page 6: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Joint Probability DistributionJoint Probability Distribution

VS

Page 7: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Bayesian networksBayesian networks

• A Bayesian Network is a graph in which:– A set of random variables makes up the nodes in the network.

– A set of directed links or arrows connects pairs of nodes.

– Each node has a conditional probability table that quantifies the effects the parents have on the node.

– Directed, acyclic graph (DAG), i.e. no directed cycles.

A Bayesian network modeling the joint probability distribution P(B, E, R, A, N)

P(B) P(E)

P(R|E)P(A|B,E)

P(N|A)

Page 8: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Conditional probability tables (CPT):

Bayesian NetworksBayesian Networks

P(B) P(E)

P(N|A)

P(R|E)P(A|B,E)

Page 9: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Independences in BNsIndependences in BNs

• Three basic independence structures:

Page 10: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Independences in BNsIndependences in BNs

• Indirect cause: Burglary is independent of NeighborCall given Alarm

– P(N|A, B) = P(N|A)

– P(N, B|A) = P(N|A)P(B|A)

Page 11: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Independences in BNsIndependences in BNs

• Common cause: Alarm is independent of RadioAnnounce given Earthquake

– P(A|E, R) = P(A|E)

– P(A, R|E) = P(A|E)P(R|E)

Page 12: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Independences in BNsIndependences in BNs

• Common effect: – Burglary is independent of Earthquake when Alarm is not known

– Burglary and Earthquake become dependent given Alarm!!!

Page 13: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Path blockingPath blocking

• With linear substructure

• With common cause structure

• With common effect structure

X YC in Z

X YC in Z

X Y

C or any of its descendants not in Z

Page 14: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Independences in BNsIndependences in BNs

• Earthquake ┴ Burglary | NeighborCall?• Burglary ┴ NeighborCall ?• Burglary ┴ RadioAnnounce | Earthquake?• Burglary ┴ RadioAnnounce | NeighborCall?

Page 15: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Markov AssumptionMarkov Assumption

• Each variable is independent on its non-descendants, given its parents in Bayesian networks

• N ┴ B, E, R | A• R ┴ B, A, N | E• …

Page 16: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Full Joint Distribution in BNsFull Joint Distribution in BNs

),,,,( NRAEBP

)()(),|()|()|( BPEPEBAPERPANP

Page 17: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Chain RuleChain Rule

n

iiin XPaXPXXXP

121 ))(|(),...,,(

Page 18: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Types of Inference TasksTypes of Inference Tasks

Compute the probability or most likely state of a set of query variables given observed values of some other variables

• Belief updating

• Most probable explanation (MPE) queries

• Maximum A Posteriori (MAP) queries

Page 19: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Inference DirectionInference Direction

• Types of inference by reasoning direction

Page 20: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

ExamplesExamples

• Diagnostic inferences: from effect to causes.

• Causal Inferences: from causes to effects.

• Intercausal Inferences:

• Mixed Inference:

Page 21: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

Influence diagramInfluence diagram

• Add action nodes and utility nodes to Bayesian networks to enable rational decision making

• Algorithm:For each value of action node

compute expected value of utility node given action, evidenceReturn MEU action

Page 22: Introduction to Bayesian Networks. Representation: What is the joint probability distribution P(B, E, R, A, N) over five binary variables? How many states

• Idea: compute value of acquiring each possible piece of evidence– Can be done directly from influence diagram

• Example: buying oil drilling rightsTwo blocks A and B, exactly one has oil, worth kPrior probabilities 0.5 each, mutually exclusive

Current price of each block is k/2“Consultant” offers accurate survey of A. Fair price?

• Solution: compute expected value of information= expected value of best action given the information

minus expected value of best action without information

• Survey may say “oil in A” or “no oil in A”, prob. 0.5 each (given!)

= [0.5 * value of “buy A” given “oil in A”+ 0.5 * value of “buy B” given “no oil in A”]- 0

= (0.5 * k/2) + (0.5 * k/2) - 0 = k/2

Value of informationValue of information

Oil

Survey

Buy

U