

Bayesian Abductive Logic Programs

Sindhu Raghavan Raymond J. Mooney

The University of Texas at Austin

1

Abduction

• Process of finding the best explanation for a set of observations (Peirce 1958)

• Inference of cause from effect

• Applications

– Plan recognition

– Medical diagnosis

– Natural language understanding

2

Logical Abduction

Given:

• Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic

• Observations, O, in the form of atomic facts in first-order logic

Find:

• A hypothesis, H, a set of assumptions (atomic facts) that logically entail the observations given the theory: B ∪ H ⊨ O

• Typically, the best explanation is the one with the fewest assumptions, i.e., the one that minimizes |H|

3

Example - Plan Recognition

Background knowledge B:

go(person,loc) :- shopping(person,loc,item).
go(person,loc) :- robbing(person,loc,instr).
get(person,instr) :- robbing(person,loc,instr).
get(person,instr) :- hunting(person,loc,instr).
store(loc) :- shopping(person,loc,item).
store(loc) :- robbing(person,loc,instr).
gun(instr) :- robbing(person,loc,instr).
gun(instr) :- hunting(person,loc,instr).

4

Sample Observations O

get(john,o1) gun(o1) go(john,p1) store(p1)

5

Abductive Proof 1

get(john,o1) gun(o1) go(john,p1) store(p1)

hunting(john,s2,o1) shopping(john,p1,s1)

6

Abductive Proof 2

get(john,o1) gun(o1) go(john,p1) store(p1)

robbing(john,p1,o1)

7

Best Explanation

8

Explanation 1

hunting(john,s2,o1) shopping(john,p1,s1)

Explanation 2

robbing(john,p1,o1)

The best explanation makes the fewest assumptions: Explanation 2 needs one assumption, while Explanation 1 needs two.
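The entailment condition B ∪ H ⊨ O and the fewest-assumptions preference can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation. It assumes the background clauses have already been grounded by hand for the constants in the example, and the string encoding of literals and the names `entails` and `best_explanation` are illustrative choices, not part of the original slides.

```python
# Hand-grounded instances of the background clauses, as (head, body) pairs.
# The grounding and the string encoding are assumptions for illustration.
GROUND_RULES = [
    ("go(john,p1)", ["shopping(john,p1,s1)"]),
    ("store(p1)", ["shopping(john,p1,s1)"]),
    ("get(john,o1)", ["hunting(john,s2,o1)"]),
    ("gun(o1)", ["hunting(john,s2,o1)"]),
    ("go(john,p1)", ["robbing(john,p1,o1)"]),
    ("get(john,o1)", ["robbing(john,p1,o1)"]),
    ("store(p1)", ["robbing(john,p1,o1)"]),
    ("gun(o1)", ["robbing(john,p1,o1)"]),
]

OBSERVATIONS = {"get(john,o1)", "gun(o1)", "go(john,p1)", "store(p1)"}


def entails(rules, hypothesis, observations):
    """Return True if the rules plus the hypothesis derive every observation.

    Uses naive forward chaining over ground Horn clauses.
    """
    known = set(hypothesis)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(lit in known for lit in body):
                known.add(head)
                changed = True
    return observations <= known


def best_explanation(rules, candidates, observations):
    """Among candidates that entail the observations, return the smallest."""
    valid = [h for h in candidates if entails(rules, h, observations)]
    return min(valid, key=len) if valid else None


explanation_1 = {"hunting(john,s2,o1)", "shopping(john,p1,s1)"}
explanation_2 = {"robbing(john,p1,o1)"}
best = best_explanation(
    GROUND_RULES, [explanation_1, explanation_2], OBSERVATIONS
)
print(sorted(best))  # ['robbing(john,p1,o1)']
```

Both candidate explanations entail the observations; the size criterion alone separates them here, which is exactly the case that breaks down when two explanations have the same number of assumptions.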

Existing Work on Logical Abduction

• Research history spanning the 1970s to the 1990s (Pople 1973; Levesque 1989; Ng and Mooney 1992).

• Abductive Logic Programming (ALP) (Kakas, Kowalski, and Toni, 1993)

– Formalization based on logic programming.

9

Problem with Logical Abduction

• Does not handle uncertainty of assumptions and inferences.

• Unable to choose between explanations with the same number of assumptions based on probability.

10

Other Approaches to Abduction

• Probabilistic abduction using Bayesian networks (Pearl 1988).

– Unable to capture relational structure since Bayes nets are propositional in nature

• Probabilistic relational abduction using Markov Logic Networks (MLNs) (Kate and Mooney 2009).

– Does not use logical abduction; instead, it uses complex reverse implications to approximate it.

11

Bayesian Logic Programs (BLP) (Kersting and De Raedt, 2001)

• Bayesian Logic Programs (BLPs) combine first-order logic and Bayesian networks.

• Deductive logic programming is used to construct the structure of a Bayes net.

• Not suitable for problems requiring abductive logical inference to form a proof structure by making assumptions.

12

Bayesian Abductive Logic Programs (BALP)

BLP + ALP = BALP

Suitable for tasks involving abductive reasoning – plan recognition, diagnosis, etc.

13

BLPs vs. BALPs

• Like BLPs, BALPs use logic programs as templates for constructing Bayesian networks.

• Unlike BLPs, BALPs use logical abduction instead of deduction to construct the network.

14

Abduction in BALPs

• Given: A set of observation literals O = {O1, O2, …, On}

• Compute all distinct abductive proofs of O.

• Construct a Bayesian network using the resulting set of proofs as in BLPs.

• Perform probabilistic inference on the Bayesian network to compute the best explanation.

15

Abductive Proof 1

get(john,o1) gun(o1) go(john,p1) store(p1)

hunting(john,s2,o1) shopping(john,p1,s1)

16

Abductive Proof 2

get(john,o1) gun(o1) go(john,p1) store(p1)

robbing(john,p1,o1)

17

Resulting Bayes Net

get(john,o1) gun(o1) go(john,p1) store(p1)

hunting(john,s2,o1) shopping(john,p1,s1)

robbing(john,p1,o1)
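The merge of the two abductive proofs into one network can be sketched as follows. This is a simplified illustration, not the authors' construction code: it assumes each proof is already available as a list of ground clause instances, and because every clause body in this example has a single literal it connects body literals directly to heads without separate nodes for ground clauses. The names `PROOF_1`, `PROOF_2`, and `build_network` are hypothetical.

```python
# Each abductive proof as ground clause instances (head, body).
PROOF_1 = [
    ("get(john,o1)", ("hunting(john,s2,o1)",)),
    ("gun(o1)", ("hunting(john,s2,o1)",)),
    ("go(john,p1)", ("shopping(john,p1,s1)",)),
    ("store(p1)", ("shopping(john,p1,s1)",)),
]
PROOF_2 = [
    ("get(john,o1)", ("robbing(john,p1,o1)",)),
    ("gun(o1)", ("robbing(john,p1,o1)",)),
    ("go(john,p1)", ("robbing(john,p1,o1)",)),
    ("store(p1)", ("robbing(john,p1,o1)",)),
]


def build_network(proofs):
    """Merge proofs into one DAG, mapping each node to its sorted parents."""
    parents = {}
    for proof in proofs:
        for head, body in proof:
            parents.setdefault(head, set()).update(body)
            for literal in body:
                # Assumed literals become root nodes with no parents.
                parents.setdefault(literal, set())
    return {node: sorted(ps) for node, ps in parents.items()}


network = build_network([PROOF_1, PROOF_2])
for node, node_parents in network.items():
    print(node, "<-", node_parents)
```

Each observed literal ends up with two parents, one from each proof, while the three assumed literals are roots.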

18

Probabilistic Parameters

• As with BLPs, CPTs for the Bayes net are specified in first-order clauses.

• The noisy-and combining rule is used to specify the CPT for combining the conjuncts in the body of the clause.

– Reduces the number of parameters needed

– Parameters can be learned from data

• The noisy-or combining rule is used to specify the CPT for combining the disjunctive contributions from different ground clauses with the same head.

– Models “explaining away”

– Parameters can be learned from data
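The two combining rules can be written down in a few lines. The sketch below uses the common textbook forms of noisy-or and noisy-and without a leak term; whether the authors' system uses exactly these forms is an assumption, and the function names are illustrative. Each rule needs one parameter per parent rather than one CPT row per combination of parent values, which is where the parameter saving comes from.

```python
from math import prod


def noisy_or(params, parent_values):
    """P(child = True) under noisy-or, with no leak term.

    Each true parent independently causes the child with its own
    probability; the child is false only if every active cause fails.
    """
    return 1.0 - prod(1.0 - p for p, on in zip(params, parent_values) if on)


def noisy_and(params, parent_values):
    """P(child = True) under noisy-and, with no leak term.

    Every parent must be true, and each true parent independently
    "succeeds" with its own probability.
    """
    if not all(parent_values):
        return 0.0
    return prod(params)


# Illustrative parameter value of 0.9 for every link.
print(noisy_or([0.9, 0.9], [True, False]))
print(noisy_or([0.9, 0.9], [True, True]))
print(noisy_and([0.9, 0.9], [True, True]))
```

With both parents active the noisy-or probability rises only from about 0.9 to about 0.99, so a second cause adds little once one cause is already present.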

19

Resulting Bayes Net

get(john,o1) gun(o1) go(john,p1) store(p1)

hunting(john,s2,o1) shopping(john,p1,s1)

robbing(john,p1,o1)

Each of the four observation nodes combines its parents with a noisy or.

20

Probabilistic Inference

• Specify the truth values of the observed facts.

• Compute the Most Probable Explanation (MPE) to determine the most likely combination of truth values for all unknown literals given this evidence.

• Use the standard Bayes-net package ELVIRA for inference.
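For a network as small as the running example, MPE can be illustrated by brute-force enumeration. The sketch below is not how ELVIRA performs inference and does not use its API. It assumes a noisy-or with link parameter 0.9 and no leak at every observed node, and an illustrative prior of 0.1 for each assumed literal; the priors in particular are made up for this example.

```python
from itertools import product
from math import prod

QUERY = ["hunting(john,s2,o1)", "shopping(john,p1,s1)", "robbing(john,p1,o1)"]

# Parents of each observed node in the example network.
PARENTS = {
    "get(john,o1)": ["hunting(john,s2,o1)", "robbing(john,p1,o1)"],
    "gun(o1)": ["hunting(john,s2,o1)", "robbing(john,p1,o1)"],
    "go(john,p1)": ["shopping(john,p1,s1)", "robbing(john,p1,o1)"],
    "store(p1)": ["shopping(john,p1,s1)", "robbing(john,p1,o1)"],
}

PRIOR = 0.1  # assumed prior for every query literal (illustrative only)
LINK = 0.9   # assumed noisy-or parameter for every edge


def joint(assignment):
    """P(assignment of query literals, all observations True)."""
    p = prod(PRIOR if assignment[q] else 1.0 - PRIOR for q in QUERY)
    for parents in PARENTS.values():
        active = sum(assignment[x] for x in parents)
        # Noisy-or without leak: probability 0 if no parent is active.
        p *= 1.0 - (1.0 - LINK) ** active
    return p


def mpe():
    """Return the most probable truth assignment to the query literals."""
    candidates = (
        dict(zip(QUERY, values))
        for values in product([False, True], repeat=len(QUERY))
    )
    return max(candidates, key=joint)


print(mpe())
```

Under these assumed parameters the highest-scoring assignment sets only the robbing literal to true. Enumeration is exponential in the number of unknown literals, so it is only practical for toy networks.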

21

Resulting Bayes Net

Observed facts (each combined from its parents by a noisy or):

get(john,o1) gun(o1) go(john,p1) store(p1)

Query variables:

hunting(john,s2,o1) = FALSE

shopping(john,p1,s1) = FALSE

robbing(john,p1,o1) = TRUE

25

Experimental Evaluation

26

Story Understanding Data

• Recognizing plans from narrative text (Charniak and Goldman 1991; Ng and Mooney 1992).

• Infer characters’ higher-level plans that explain their observed actions represented in logic.

– “Fred went to the supermarket. He pointed a gun at the owner. He packed his bag.” => robbing

– “Jack went to the supermarket. He found some milk on the shelf. He paid for it.” => shopping

• 25 development examples and 25 test examples.

• 12.6 observations per example on average.

• Background knowledge base originally constructed for the ACCEL system (Ng and Mooney 1992).

27

Story Understanding Methodology

• Noisy-and and noisy-or parameters were set to 0.9, and priors were hand-tuned on development data.

• Multiple high-level plans per example are possible.

• MPE inference used to compute the best explanation.

• Computed precision, recall and F-measure.
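Because several high-level plans may be correct for one example, the metrics are computed over sets of plans. The sketch below uses the standard definitions with exact matching of plan labels; the slides do not say how matches were counted (for instance, whether partial credit was given for arguments), so the matching criterion and the function name are assumptions.

```python
def precision_recall_f1(predicted, gold):
    """Standard precision, recall, and F-measure over two sets of plans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# One correct plan predicted plus one spurious plan.
print(precision_recall_f1({"robbing", "shopping"}, {"robbing"}))
```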

28

Story Understanding Systems Evaluated

• BALPs

• ACCEL – Simplicity (Ng and Mooney 1992)

– Logical abduction preferring the fewest assumptions.

• ACCEL – Coherence (Ng and Mooney 1992)

– Logical abduction that maximally connects observations.

– Specific to story understanding.

• Abductive MLNs (Kate and Mooney 2009)

29

Story Understanding Results

30

Monroe Data

• Recognizing high level plans in an emergency response domain. Developed by Blaylock and Allen (2005) to test a statistical n-gram approach.

• 10 high level plans including setting up shelter, providing medical attention, clearing road wreck.

• Artificially generated using SHOP-2 HTN planner.

• 1000 examples for evaluation.

• 10.19 observation literals per example.

• Single correct plan in each example.

• Knowledge base constructed based on the domain knowledge encoded in HTN.

31

Monroe Methodology

• Parameters were set as in story understanding.

• Computed marginal probabilities for all high level plans and selected the single one with the highest probability.

• Computed convergence score to compare with Blaylock and Allen results.

– Convergence score is the fraction of examples for which the top level plan schema (predicate only) was predicted accurately after seeing all observations.
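The convergence score as defined above compares only the predicate of the predicted top-level plan with the predicate of the correct plan. A minimal sketch, assuming plans are written as strings of the form `name(arg1,arg2)`; the plan predicates in the example are hypothetical names, not the actual Monroe schema names.

```python
def plan_schema(literal):
    """Return the predicate name of a ground literal such as 'f(a,b)'."""
    return literal.split("(", 1)[0].strip()


def convergence_score(predicted, gold):
    """Fraction of examples whose predicted plan schema matches the gold one.

    Both arguments are equal-length lists with one top-level plan per
    example, predicted after all observations have been seen.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(plan_schema(p) == plan_schema(g) for p, g in zip(predicted, gold))
    return hits / len(gold)


# Hypothetical plan literals; arguments are ignored by the score.
predicted = ["set_up_shelter(s1)", "clear_road_wreck(a,b)", "provide_medical_attention(p1)"]
gold = ["set_up_shelter(s2)", "clear_road_wreck(a,c)", "set_up_shelter(s3)"]
print(convergence_score(predicted, gold))
```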

32

Monroe Results

System                Convergence score
BALP                  98.8
Blaylock and Allen    94.2

33

Modified Monroe Data

• Applying the Kate & Mooney (2009) abductive MLN approach to Monroe resulted in explosively large ground networks.

• Simplified Monroe domain just enough to prevent these problems.

• Developed typed clauses effective for abductive MLNs.

• Weight learning was still not tractable, so weights were set manually.

34

Modified Monroe Methodology

• Repeatedly measured the percentage of correct plans inferred (including correct arguments) after observing an increasing fraction of the actions in the plan.
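This evaluation loop can be sketched as below. The recognizer is passed in as a callable, so nothing here depends on a particular system. The fractions used, the rounding of the prefix length, and the names `accuracy_curve` and `recognize` are assumptions, since the slides do not give these details.

```python
from math import ceil


def accuracy_curve(examples, recognize, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Percentage of fully correct plans after seeing a prefix of the actions.

    examples:  list of (observations, gold_plan) pairs, where observations
               is an ordered list of observed actions.
    recognize: callable mapping a list of observations to a predicted plan
               (a hypothetical stand-in for any plan recognizer).
    """
    if not examples:
        raise ValueError("at least one example is required")
    curve = {}
    for fraction in fractions:
        correct = 0
        for observations, gold_plan in examples:
            seen = ceil(fraction * len(observations))
            # A prediction counts only if plan and arguments match exactly.
            if recognize(observations[:seen]) == gold_plan:
                correct += 1
        curve[fraction] = 100.0 * correct / len(examples)
    return curve
```

A toy recognizer that only commits to a plan once a telltale action appears would score low at small fractions and reach its final accuracy at 100% of the observations.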

35

Modified-Monroe Results

[Plot: accuracy (y-axis) against % observations seen by the systems (x-axis)]

36

Ongoing & Future Work

• Automatic learning of BALP parameters from data.

• Alternative MLN formulation more directly modeling the BALP approach.

– Preliminary results for story understanding and modified Monroe are only slightly worse than BALP results, but much worse for original Monroe.

• Compare to other SRL approaches that have incorporated logical abduction (SLPs, PRISM) when applied to plan recognition.

• Evaluation on other datasets/tasks/domains.

37

Conclusions

• BALP is a new SRL framework that combines Bayesian Logic Programs and Abductive Logic Programming.

• Well suited for relational abductive reasoning tasks like plan recognition.

• Empirical results demonstrate advantages over existing methods.

38

Questions??

39