part ii: graphical models. challenges of probabilistic models specifying well-defined probabilistic...

90
Part II: Graphical models

Post on 18-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Part II: Graphical models

Page 2: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Challenges of probabilistic models

• Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Representing probability distributions over those variables is hard (for computers/learners)

• Computing quantities using those distributions is hard (for computers/learners)

Page 3: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Representing structured distributions

Four random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

Domain

{0,1}{0,1}{0,1}{0,1}

Page 4: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Joint distribution0000000100100011010001010110011110001001101010111100110111101111

• Requires 15 numbers to specify probability of all values x1,x2,x3,x4

– N binary variables, 2N-1 numbers

• Similar cost when computing conditional probabilities

P(x2, x3, x4 | x1 =1) =P(x1 =1,x2,x3,x4 )

P(x1 =1)

Page 5: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

How can we use fewer numbers?

Four random variables:

X1 coin toss produces heads

X2 coin toss produces heads

X3 coin toss produces heads

X4 coin toss produces heads

Domain

{0,1}{0,1}{0,1}{0,1}

Page 6: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Statistical independence• Two random variables X1 and X2 are independent if

P(x1|x2) = P(x1)

– e.g. coinflips: P(x1=H|x2=H) = P(x1=H) = 0.5

• Independence makes it easier to represent and work with probability distributions

• We can exploit the product rule:

P(x1,x2, x3, x4 ) = P(x1 | x2,x3,x4 )P(x2 | x3, x4 )P(x3 | x4 )P(x4 )

P(x1, x2, x3, x4 ) = P(x1)P(x2)P(x3)P(x4 )

If x1, x2, x3, and x4 are all independent…

Page 7: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Expressing independence

• Statistical independence is the key to efficient probabilistic representation and computation

• This has led to the development of languages for indicating dependencies among variables

• Some of the most popular languages are based on “graphical models”

Page 8: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Part II: Graphical models

• Introduction to graphical models– representation and inference

• Causal graphical models– causality– learning about causal relationships

• Graphical models and cognitive science– uses of graphical models– an example: causal induction

Page 9: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Part II: Graphical models

• Introduction to graphical models– representation and inference

• Causal graphical models– causality– learning about causal relationships

• Graphical models and cognitive science– uses of graphical models– an example: causal induction

Page 10: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Graphical models

• Express the probabilistic dependency structure among a set of variables (Pearl, 1988)

• Consist of– a set of nodes, corresponding to variables– a set of edges, indicating dependency– a set of functions defined on the graph that

specify a probability distribution

QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.

Page 11: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Undirected graphical models

• Consist of– a set of nodes– a set of edges– a potential for each clique, multiplied together to yield the distribution over variables

• Examples– statistical physics: Ising model, spinglasses– early neural networks (e.g. Boltzmann machines)

X1

X2

X3 X4

X5

Page 12: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Directed graphical modelsX3 X4

X5

X1

X2

• Consist of– a set of nodes– a set of edges– a conditional probability distribution for each node, conditioned on its parents, multiplied together

to yield the distribution over variables

• Constrained to directed acyclic graphs (DAGs)• Called Bayesian networks or Bayes nets

Page 13: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Bayesian networks and Bayes

• Two different problems– Bayesian statistics is a method of inference– Bayesian networks are a form of representation

• There is no necessary connection– many users of Bayesian networks rely upon

frequentist statistical methods– many Bayesian inferences cannot be easily

represented using Bayesian networks

Page 14: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Properties of Bayesian networks

• Efficient representation and inference– exploiting dependency structure makes it easier

to represent and compute with probabilities

• Explaining away– pattern of probabilistic reasoning characteristic

of Bayesian networks, especially early use in AI

Page 15: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Properties of Bayesian networks

• Efficient representation and inference– exploiting dependency structure makes it easier

to represent and compute with probabilities

• Explaining away– pattern of probabilistic reasoning characteristic

of Bayesian networks, especially early use in AI

Page 16: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Efficient representation and inferenceFour random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

Page 17: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

The Markov assumption

Every node is conditionally independent of its non-descendants, given its parents

P(xi | xi+1,...,xk ) = P(xi | Pa(X i ))

where Pa(Xi) is the set of parents of Xi

P(x1,..., xk ) = P(x i | Pa(X i))i=1

k

(via the product rule)

Page 18: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Efficient representation and inferenceFour random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

1 1

24

total = 7 (vs 15)

Page 19: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Reading a Bayesian network

• The structure of a Bayes net can be read as the generative process behind a distribution

• Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents

Page 20: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Reading a Bayesian networkFour random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

Page 21: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Reading a Bayesian network

• The structure of a Bayes net can be read as the generative process behind a distribution

• Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents

• Simple rules for determining whether two variables are dependent or independent

• Independence makes inference more efficient

Page 22: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Computing with Bayes nets

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

P(x2, x3,x4 | x1 =1) =P(x1 =1, x2,x3, x4 )

P(x1 =1)

Page 23: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Computing with Bayes nets

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

P(x1 =1) = P(x1 =1, x2,x3,x4 )x2 ,x3 ,x4

∑sum over 8 values

Page 24: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Computing with Bayes nets

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

P(x1 =1) = P(x1 | x3, x4 )P(x2 | x3)P(x3)P(x4 )x2 ,x3 ,x4

Page 25: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Computing with Bayes nets

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)

P(x1 =1) = P(x1 | x3, x4 )P(x3)P(x4 )x3 ,x4

∑sum over 4 values

Page 26: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Computing with Bayes nets

• Inference algorithms for Bayesian networks exploit dependency structure

• Message-passing algorithms– “belief propagation” passes simple messages between

nodes, exact for tree-structured networks

• More general inference algorithms– exact: “junction-tree”– approximate: Monte Carlo schemes (see Part IV)

Page 27: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Properties of Bayesian networks

• Efficient representation and inference– exploiting dependency structure makes it easier

to represent and compute with probabilities

• Explaining away– pattern of probabilistic reasoning characteristic

of Bayesian networks, especially early use in AI

Page 28: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Assume grass will be wet if and only if it rained last night, or if the sprinklers were left on:

Explaining away

Rain Sprinkler

Grass Wet

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 29: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

)(

)()|()|(

wP

rPrwPwrP =

Compute probability it rained last night, given that the grass is wet:

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 30: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

∑′′

′′′′=

sr

srPsrwP

rPrwPwrP

,

),(),|(

)()|()|(

Compute probability it rained last night, given that the grass is wet:

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 31: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

),(),(),(

)()|(

srPsrPsrP

rPwrP

¬+¬+=

Compute probability it rained last night, given that the grass is wet:

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 32: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

Compute probability it rained last night, given that the grass is wet:

),()(

)()|(

srPrP

rPwrP

¬+=

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 33: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

)()()(

)()|(

sPrPrP

rPwrP

¬+=

Compute probability it rained last night, given that the grass is wet:

Between 1 and P(s)

)(rP>

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 34: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

Compute probability it rained last night, given that the grass is wet and sprinklers were left on:

)|(

)|(),|(),|(

swP

srPsrwPswrP =

Both terms = 1

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 35: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

Compute probability it rained last night, given that the grass is wet and sprinklers were left on:

)(rP=)|(),|( srPswrP =

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 36: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Explaining away

Rain Sprinkler

Grass Wet

)(rP=)|(),|( srPswrP =)()()(

)()|(

sPrPrP

rPwrP

¬+= )(rP>

“Discounting” to prior probability.

.andif0 sSrR ¬=¬==

),|()()(),,( RSWPSPRPWSRP =

rRsSRSwWP ==== orif1),|(

Page 37: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Formulate IF-THEN rules:– IF Rain THEN Wet– IF Wet THEN Rain

• Rules do not distinguish directions of inference• Requires combinatorial explosion of rules

Contrast w/ production system

Rain

Grass Wet

Sprinkler

IF Wet AND NOT Sprinkler THEN Rain

Page 38: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Observing rain, Wet becomes more active. • Observing grass wet, Rain and Sprinkler become more active• Observing grass wet and sprinkler, Rain cannot become less active. No explaining away!

• Excitatory links: Rain Wet, Sprinkler Wet

Contrast w/ spreading activation

Rain Sprinkler

Grass Wet

Page 39: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Observing grass wet, Rain and Sprinkler become more active• Observing grass wet and sprinkler, Rain becomes less active: explaining away

• Excitatory links: Rain Wet, Sprinkler Wet• Inhibitory link: Rain Sprinkler

Contrast w/ spreading activation

Rain Sprinkler

Grass Wet

Page 40: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Each new variable requires more inhibitory connections• Not modular

– whether a connection exists depends on what others exist– big holism problem – combinatorial explosion

Contrast w/ spreading activationRain

Sprinkler

Grass Wet

Burst pipe

Page 41: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Contrast w/ spreading activation

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

(McClelland & Rumelhart, 1981)

Page 42: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Graphical models

• Capture dependency structure in distributions

• Provide an efficient means of representing and reasoning with probabilities

• Allow kinds of inference that are problematic for other representations: explaining away– hard to capture in a production system– more natural than with spreading activation

Page 43: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Part II: Graphical models

• Introduction to graphical models– representation and inference

• Causal graphical models– causality– learning about causal relationships

• Graphical models and cognitive science– uses of graphical models– an example: causal induction

Page 44: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Causal graphical models

• Graphical models represent statistical dependencies among variables (ie. correlations)– can answer questions about observations

• Causal graphical models represent causal dependencies among variables (Pearl, 2000)

– express underlying causal structure– can answer questions about both observations and

interventions (actions upon a variable)

QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.

Page 45: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Bayesian networks

Four random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

Nodes: variablesLinks: dependency

Each node has a conditional probability distribution

Data: observations of x1, ..., x4

Page 46: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Causal Bayesian networks

Four random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

Nodes: variablesLinks: causality

Each node has a conditional probability distribution

Data: observations of and interventions on x1, ..., x4

Page 47: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Interventions

Four random variables:

X1 coin toss produces heads

X2 pencil levitates

X3 friend has psychic powers

X4 friend has two-headed coin

X1 X2

X3X4P(x4) P(x3)

P(x2|x3)P(x1|x3, x4)

hold down pencil

X

Cut all incoming links for the node that we intervene on

Compute probabilities with “mutilated” Bayes net

Page 48: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Strength: how strong is a relationship?• Structure: does a relationship exist?

E

B C

E

B CB B

Learning causal graphical models

Page 49: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Strength: how strong is a relationship?

E

B C

E

B CB B

Causal structure vs. causal strength

Page 50: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Strength: how strong is a relationship?– requires defining nature of relationship

E

B C

w0 w1

E

B C

w0

B B

Causal structure vs. causal strength

Page 51: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Parameterization

• Structures: h1 = h0 =

• Parameterization:

E

B C

E

B C

C B

0 01 00 11 1

h1: P(E = 1 | C, B) h0: P(E = 1| C, B)

p00

p10

p01

p11

p0

p0

p1

p1

Generic

Page 52: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Parameterization

• Structures: h1 = h0 =

• Parameterization:

E

B C

E

B C

w0 w1w0

w0, w1: strength parameters for B, C

C B

0 01 00 11 1

h1: P(E = 1 | C, B) h0: P(E = 1| C, B)

0w1

w0

w1+ w0

00w0

w0

Linear

Page 53: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Parameterization

• Structures: h1 = h0 =

• Parameterization:

E

B C

E

B C

w0 w1w0

w0, w1: strength parameters for B, C

C B

0 01 00 11 1

h1: P(E = 1 | C, B) h0: P(E = 1| C, B)

0w1

w0

w1+ w0 – w1 w0

00w0

w0

“Noisy-OR”

Page 54: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Parameter estimation

• Maximum likelihood estimation:

maximize i P(bi,ci,ei; w0, w1)

• Bayesian methods: as in Part I

Page 55: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Structure: does a relationship exist?

E

B C

E

B CB B

Causal structure vs. causal strength

Page 56: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Approaches to structure learning

• Constraint-based:– dependency from statistical tests (eg. 2)– deduce structure from dependencies E

B CB

(Pearl, 2000; Spirtes et al., 1993)

Page 57: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Approaches to structure learning

E

B CB• Constraint-based:– dependency from statistical tests (eg. 2)– deduce structure from dependencies

(Pearl, 2000; Spirtes et al., 1993)

Page 58: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Approaches to structure learning

E

B CB• Constraint-based:– dependency from statistical tests (eg. 2)– deduce structure from dependencies

(Pearl, 2000; Spirtes et al., 1993)

Page 59: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Approaches to structure learning

E

B CB

Attempts to reduce inductive problem to deductive problem

• Constraint-based:– dependency from statistical tests (eg. 2)– deduce structure from dependencies

(Pearl, 2000; Spirtes et al., 1993)

Page 60: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Approaches to structure learning

E

B CB

• Bayesian:– compute posterior

probability of structures,

given observed dataE

B C

E

B C

P(h|data) P(data|h) P(h)

P(h1|data) P(h0|data)

• Constraint-based:– dependency from statistical tests (eg. 2)– deduce structure from dependencies

(Pearl, 2000; Spirtes et al., 1993)

(Heckerman, 1998; Friedman, 1999)

Page 61: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Bayesian Occam’s Razor

All possible data sets d

P(d

| h

)

h0 (no relationship)

h1 (relationship)

P(d

∑ d | h) = 1For any model h,

Page 62: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Causal graphical models

• Extend graphical models to deal with interventions as well as observations

• Respecting the direction of causality results in efficient representation and inference

• Two steps in learning causal models– strength: parameter estimation– structure: structure learning

Page 63: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Part II: Graphical models

• Introduction to graphical models– representation and inference

• Causal graphical models– causality– learning about causal relationships

• Graphical models and cognitive science– uses of graphical models– an example: causal induction

Page 64: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Uses of graphical models

• Understanding existing cognitive models– e.g., neural network models

• Representation and reasoning– a way to address holism in induction (c.f. Fodor)

• Defining generative models– mixture models, language models (see Part IV)

• Modeling human causal reasoning

Page 65: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Human causal reasoning

• How do people reason about interventions?(Gopnik, Glymour, Sobel, Schulz, Kushnir & Danks, 2004; Lagnado

& Sloman, 2004; Sloman & Lagnado, 2005; Steyvers, Tenenbaum, Wagenmakers & Blum, 2003)

• How do people learn about causal relationships?– parameter estimation (Shanks, 1995; Cheng, 1997)

– constraint-based models (Glymour, 2001)

– Bayesian structure learning (Steyvers et al., 2003; Griffiths & Tenenbaum, 2005)

Page 66: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Causation from contingencies

“Does C cause E?”(rate on a scale from 0 to 100)

E present (e+)

E absent (e-)

C present(c+)

C absent(c-)

a

b

c

d

Page 67: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Two models of causal judgment

• Delta-P (Jenkins & Ward, 1965):

• Power PC (Cheng, 1997):

dccbaacePcePP +−+=−≡Δ −+++ )()|()|(

)()|(1 dcd

P

ceP

Pp

=−

Δ≡ −+Power

Page 68: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Buehner and Cheng (1997)

People

P

Power

0.00 0.25 0.50 0.75 1.00 P

Page 69: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Buehner and Cheng (1997)

People

P

Power

Constant ΔP, changing judgments

Page 70: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Buehner and Cheng (1997)

People

P

Power

Constant causal power, changing judgments

Page 71: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Buehner and Cheng (1997)

People

P

Power

ΔP = 0, changing judgments

Page 72: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Strength: how strong is a relationship?• Structure: does a relationship exist?

E

B C

w0 w1

E

B C

w0

B B

Causal structure vs. causal strength

Page 73: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Causal strength

• Assume structure:

ΔP and causal power are maximum likelihood estimates of the strength parameter w1, under different parameterizations for P(E|B,C): – linear ΔP, Noisy-OR causal power

E

B C

w0 w1

B

Page 74: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

• Hypotheses: h1 = h0 =

• Bayesian causal inference:

support =

E

B C

E

B CB B

P(d | h1) = P(d | w0,w1) p(w0,w1 | h1)0

1

∫0

1

∫ dw0 dw1

P(d | h0) = P(d | w0) p(w0 | h0)0

1

∫ dw0

Causal structure

P(d|h1)

P(d|h0)likelihood ratio (Bayes factor) gives evidence in favor of h1

Page 75: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

People

ΔP (r = 0.89)

Power (r = 0.88)

Support (r = 0.97)

Buehner and Cheng (1997)

Page 76: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

The importance of parameterization

• Noisy-OR incorporates mechanism assumptions:– generativity: causes increase probability of effects

– each cause is sufficient to produce the effect

– causes act via independent mechanisms(Cheng, 1997)

• Consider other models:– statistical dependence: 2 test

– generic parameterization (cf. Anderson, 1990)

Page 77: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

People

Support (Noisy-OR)

2

Support (generic)

Page 78: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Generativity is essential

• Predictions result from “ceiling effect”– ceiling effects only matter if you believe a cause increases the

probability of an effect

P(e+|c+)P(e+|c-)

8/88/8

6/86/8

4/84/8

2/82/8

0/80/8

Support 10050

0

Page 79: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Blicket detector (Dave Sobel, Alison Gopnik, and colleagues)

See this? It’s a blicket machine. Blickets make it go.

Let’s put this oneon the machine.

Oooh, it’s a blicket!

Page 80: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

– Two objects: A and B– Trial 1: A B on detector – detector active– Trial 2: A on detector – detector active– 4-year-olds judge whether each object is a blicket

• A: a blicket (100% say yes)

• B: probably not a blicket (34% say yes)

“Backwards blocking” (Sobel, Tenenbaum & Gopnik, 2004)

AB Trial A TrialA B

Page 81: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Possible hypotheses

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

E

BA

Page 82: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Bayesian inference

• Evaluating causal models in light of data:

• Inferring a particular causal relation:

P(hi | d) =P(d | hi)P(hi)

P(d | h j )P(h j )j

∑∈

→=→H

jh

jj dhPhEAPdEAP )|()|()|(

Page 83: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Bayesian inference

With a uniform prior on hypotheses, and the generic parameterization

A B

Probability of being a blicket

0.32 0.32

0.34 0.34

Page 84: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Modeling backwards blocking

Assume…

• Links can only exist from blocks to detectors

• Blocks are blickets with prior probability q

• Blickets always activate detectors, but detectors never activate on their own– deterministic Noisy-OR, with wi = 1 and w0 = 0

Page 85: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Modeling backwards blocking

P(E=1 | A=0, B=0): 0 0 0 0

P(E=1 | A=1, B=0): 0 0 1 1P(E=1 | A=0, B=1): 0 1 0 1P(E=1 | A=1, B=1): 0 1 1 1

E

BA

E

BA

E

BA

E

BA

P(h00) = (1 – q)2 P(h10) = q(1 – q)P(h01) = (1 – q) q P(h11) = q2

P(B → E | d) = P(h01) + P(h11) = q

Page 86: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

P(B → E | d) =P(h01) + P(h11)

P(h01) + P(h10) + P(h11)=

q

q + q(1− q)

P(E=1 | A=1, B=1): 0 1 1 1

E

BA

E

BA

E

BA

E

BA

P(h00) = (1 – q)2 P(h10) = q(1 – q)P(h01) = (1 – q) q P(h11) = q2

Modeling backwards blocking

Page 87: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

P(B → E | d) =P(h11)

P(h10) + P(h11)= q

P(E=1 | A=1, B=0): 0 1 1

P(E=1 | A=1, B=1): 1 1 1

E

BA

E

BA

E

BA

P(h10) = q(1 – q)P(h01) = (1 – q) q P(h11) = q2

Modeling backwards blocking

Page 88: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Manipulating prior probability(Tenenbaum, Sobel, Griffiths, & Gopnik, submitted)

AB Trial A TrialInitial

Page 89: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)

Summary

• Graphical models provide solutions to many of the challenges of probabilistic models– defining structured distributions– representing distributions on many variables– efficiently computing probabilities

• Causal graphical models provide tools for defining rational models of human causal reasoning and learning

Page 90: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)