1 Chapter 8 Inference and Resolution for Problem Solving

Upload: byron-greene

Post on 31-Dec-2015

TRANSCRIPT

Chapter 8

Inference and Resolution for Problem Solving

Chapter 8 Contents (1)

Normal Forms
Converting to CNF
Clauses
The Resolution Rule
Resolution Refutation
Proof by Refutation
Example: Graph Coloring

Chapter 8 Contents (2)

Normal Forms in Predicate Calculus
Skolemization
Unification
The Unification algorithm
The Resolution Algorithm
Horn Clauses in PROLOG

Normal Forms

A wff is in Conjunctive Normal Form (CNF) if it is a conjunction of disjunctions:

A1 Λ A2 Λ A3 Λ … Λ An

where each clause, Ai, is of the form:

B1 v B2 v B3 v … v Bn

The B’s are literals. E.g.: A Λ (B v C) Λ (¬A v ¬B v ¬C v D)

Similarly, a wff is in Disjunctive Normal Form (DNF) if it is a disjunction of conjunctions. E.g.: A v (B Λ C) v (¬A Λ ¬B Λ ¬C Λ D)

Converting to CNF

Any wff can be converted to CNF by using the following equivalences:

(1) A ↔ B ≡ (A → B) Λ (B → A)
(2) A → B ≡ ¬A v B
(3) ¬(A Λ B) ≡ ¬A v ¬B
(4) ¬(A v B) ≡ ¬A Λ ¬B
(5) ¬¬A ≡ A
(6) A v (B Λ C) ≡ (A v B) Λ (A v C)

Importantly, this procedure can be turned into an algorithm, which will be useful when we come to automating resolution.
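The six equivalences can be applied mechanically in three passes: remove ↔ and →, push negations inward, then distribute v over Λ. The following is a minimal Python sketch rather than a definitive implementation. The nested-tuple encoding of formulas and the function names (`eliminate`, `push_not`, `distribute`, `to_cnf`) are assumptions made for illustration.

```python
# Formulas are atoms (strings) or tuples: ("not", p), ("and", p, q),
# ("or", p, q), ("imp", p, q), ("iff", p, q). This encoding is an assumption.

def eliminate(f):
    """Equivalences (1) and (2): rewrite <-> and -> using not, and, or."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate(a) for a in args]
    if op == "iff":
        a, b = args
        return ("and", ("or", ("not", a), b), ("or", ("not", b), a))
    if op == "imp":
        a, b = args
        return ("or", ("not", a), b)
    return (op, *args)

def push_not(f):
    """Equivalences (3), (4), (5): move negations inward until they sit on atoms."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op == "not":
        g = args[0]
        if isinstance(g, str):
            return f
        gop, *gargs = g
        if gop == "not":
            return push_not(gargs[0])
        if gop == "and":
            return ("or", push_not(("not", gargs[0])), push_not(("not", gargs[1])))
        if gop == "or":
            return ("and", push_not(("not", gargs[0])), push_not(("not", gargs[1])))
    return (op, *[push_not(a) for a in args])

def distribute(f):
    """Equivalence (6): distribute 'or' over 'and'."""
    if isinstance(f, str) or f[0] == "not":
        return f
    op, a, b = f
    a, b = distribute(a), distribute(b)
    if op == "or":
        if not isinstance(a, str) and a[0] == "and":
            return ("and", distribute(("or", a[1], b)), distribute(("or", a[2], b)))
        if not isinstance(b, str) and b[0] == "and":
            return ("and", distribute(("or", a, b[1])), distribute(("or", a, b[2])))
    return (op, a, b)

def to_cnf(f):
    return distribute(push_not(eliminate(f)))

# (A -> B) -> C becomes (A v C) Λ (¬B v C)
print(to_cnf(("imp", ("imp", "A", "B"), "C")))
```

The passes must run in this order, because distribution assumes that only "and", "or" and negated atoms remain.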

Clauses

Having converted a wff to CNF, it is usual to write it as a set of clauses. E.g.:

(A → B) → C

In CNF is:

(A v C) Λ (¬B v C)

In clause form, we write:

{(A, C), (¬B, C)}

The Resolution Rule

The resolution rule is written as follows:

A v B     ¬B v C
----------------
     A v C

This tells us that if we have two clauses, one containing a literal and the other its negation, we can combine them into a single clause with that literal and its negation removed.

E.g.: if we have {(A, C), (¬A, D)}, we would apply resolution to get {(C, D)}.
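A single resolution step is easy to express over clause sets. The sketch below is illustrative only: it assumes clauses are Python frozensets of literal strings, with a leading "~" standing for ¬, and the names `negate` and `resolve` are hypothetical.

```python
def negate(lit):
    """Return the complementary literal ("A" <-> "~A")."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return every clause obtainable by resolving c1 with c2 on one literal."""
    resolvents = set()
    for lit in c1:
        if negate(lit) in c2:
            # Drop the complementary pair and merge what is left.
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

# {(A, C), (¬A, D)} resolves to the single clause (C, D)
print(resolve(frozenset({"A", "C"}), frozenset({"~A", "D"})))
```

Resolving two complementary unit clauses, such as A and ¬A, yields the empty frozenset, which plays the role of falsum.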

Resolution Refutation

Let us resolve: {(¬A, B), (¬A, ¬B, C), A, ¬C}

We begin by resolving the first clause with the second clause, thus eliminating B and ¬B:

{(¬A, C), A, ¬C}

Resolving (¬A, C) with A then gives:

{C, ¬C}

Now we can resolve both remaining literals, which gives falsum:

⊥

If we reach falsum, we have proved that our initial set of clauses was inconsistent.

This is written: {(¬A, B), (¬A, ¬B, C), A, ¬C} ⊨ ⊥

Proof by Refutation

If we want to prove that a logical argument is valid, we negate its conclusion, convert it to clause form, and then try to derive falsum using resolution.

If we derive falsum, then our clauses were inconsistent, meaning the original argument was valid, since we negated its conclusion.

Proof by Refutation - Example

Our argument is:

(A Λ ¬B) → C
A Λ ¬B
∴ C

Negate the conclusion and convert to clauses:

{(¬A, B, C), A, ¬B, ¬C}

Now resolve:

{(B, C), ¬B, ¬C}
{C, ¬C}
⊥

We have reached falsum, so our original argument was valid.
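Refutation can be automated for propositional clauses by resolving pairs repeatedly until either the empty clause appears or nothing new can be derived. The following Python sketch is one naive way to do this, not an efficient prover. It assumes clauses are sets of literal strings with "~" for ¬, and the names `resolvents` and `refute` are hypothetical.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return {frozenset((c1 - {l}) | (c2 - {negate(l)})) for l in c1 if negate(l) in c2}

def refute(clauses):
    """Return True if the empty clause (falsum) can be derived by resolution."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:          # empty clause: the set is inconsistent
                    return True
                new.add(r)
        if new <= known:           # nothing new: no refutation exists
            return False
        known |= new

# The negated-conclusion clause set from the example above.
print(refute([{"~A", "B", "C"}, {"A"}, {"~B"}, {"~C"}]))
```

Because a finite set of propositional literals allows only finitely many clauses, the loop always terminates. A result of `False` means the clause set is consistent, so the argument being tested is not valid.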

Refutation Proofs in Tree Form

It is often sensible to represent a refutation proof in tree form:

In this case, the proof has failed, as we are left with E instead of falsum.

Example: Graph Coloring

Resolution refutation can be used to determine if a solution exists for a particular combinatorial problem.

For example, for graph coloring, we represent the assignment of colors to the nodes and the constraints regarding edges as propositions, and attempt to prove that the complete set of clauses is consistent.

This does not tell us how to color the graph, simply that it is possible.

Example: Graph Coloring (2)

Available colors are r (red), g (green) and b (blue).

Ar means that node A has been colored red.

Each node must have exactly one color:

Ar v Ag v Ab

¬Ar v ¬Ag (≡ Ar → ¬Ag)

¬Ag v ¬Ab

¬Ab v ¬Ar

If (A,B) is an edge, then:

¬Ar v ¬Br

¬Ab v ¬Bb

¬Ag v ¬Bg

Now we construct the complete set of clauses for our graph, and try to see if they are consistent, using resolution.
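Generating the clause set is purely mechanical. The Python sketch below builds it for a small graph and then checks consistency. Note that the check enumerates truth assignments by brute force, which is a swapped-in technique used only to keep the example short; it is not resolution. The literal encoding ("Ar", "~Ar") and the function names are assumptions.

```python
from itertools import combinations, product

COLORS = "rgb"

def coloring_clauses(nodes, edges):
    """Build the propositional clauses for 3-coloring the given graph."""
    clauses = []
    for n in nodes:
        clauses.append({n + c for c in COLORS})              # at least one color
        for c1, c2 in combinations(COLORS, 2):
            clauses.append({"~" + n + c1, "~" + n + c2})     # at most one color
    for a, b in edges:
        for c in COLORS:
            clauses.append({"~" + a + c, "~" + b + c})       # endpoints differ
    return clauses

def consistent(clauses):
    """Brute force (not resolution): does some assignment satisfy every clause?"""
    atoms = sorted({lit.lstrip("~") for clause in clauses for lit in clause})
    for values in product([True, False], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(
            any(model[lit.lstrip("~")] != lit.startswith("~") for lit in clause)
            for clause in clauses
        ):
            return True
    return False

triangle = coloring_clauses("ABC", [("A", "B"), ("B", "C"), ("A", "C")])
print(len(triangle), consistent(triangle))
```

A triangle is 3-colorable, so its clause set is consistent. The complete graph on four nodes is not, so its clause set is inconsistent.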

Normal Forms in Predicate Calculus

A FOPC expression can be put in prenex normal form by converting it to CNF, with the quantifiers moved to the front.

For example, the wff:

(∀x)(A(x) → B(x)) → (∃y)(A(y) Λ B(y))

Converts to prenex normal form as:

(∃x)(∃y)(((A(x) v A(y)) Λ (¬B(x) v A(y)) Λ (A(x) v B(y)) Λ (¬B(x) v B(y))))

Conversion to Prenex detail

(∀x)(A(x) → B(x)) → (∃y)(A(y) Λ B(y))

¬(∀x)(A(x) → B(x)) v (∃y)(A(y) Λ B(y))

¬(∀x)(¬A(x) v B(x)) v (∃y)(A(y) Λ B(y))

(∃x)¬(¬A(x) v B(x)) v (∃y)(A(y) Λ B(y))

(∃x)(¬¬A(x) Λ ¬B(x)) v (∃y)(A(y) Λ B(y))

(∃x)(A(x) Λ ¬B(x)) v (∃y)(A(y) Λ B(y))

(∃x)(∃y)((A(x) Λ ¬B(x)) v (A(y) Λ B(y)))

(∃x)(∃y)(((A(x) v A(y)) Λ (¬B(x) v A(y)) Λ (A(x) v B(y)) Λ (¬B(x) v B(y))))

Skolemization

Before resolution can be applied, ∃ must be removed, using skolemization.

Variables that are existentially quantified are replaced by a constant:

(∃x) P(x)

is converted to:

P(c)

c must not already exist in the expression.

Skolem functions

If the existentially quantified variable is within the scope of a universally quantified variable, it must be replaced by a skolem function, which is a function of the universally quantified variable.

Sounds complicated, but is actually simple:

(∀x)(∃y)(P(x,y))

Is Skolemized to give:

(∀x)(P(x,f(x)))

After skolemization, ∀ is dropped, and the expression converted to clauses in the usual manner.
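For a formula already in prenex form, skolemization is a single left-to-right walk over the quantifier prefix. The following Python sketch illustrates the idea under several assumptions. The prefix is a list of ("forall" | "exists", variable) pairs, the matrix is a nested tuple, and the generated names sk1, sk2, … are assumed not to occur in the formula. Variable names must also differ from predicate and function names.

```python
from itertools import count

def substitute(term, var, value):
    """Replace every occurrence of `var` inside a nested-tuple expression."""
    if term == var:
        return value
    if isinstance(term, tuple):
        return tuple(substitute(t, var, value) for t in term)
    return term

def skolemize(prefix, matrix):
    """Remove existential quantifiers from a prenex formula.

    Returns the remaining universally quantified variables and the new matrix.
    """
    fresh = count(1)
    universals = []
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = f"sk{next(fresh)}"
            # A Skolem constant if no universal variable is in scope,
            # otherwise a Skolem function of the universals seen so far.
            skolem = (name, *universals) if universals else name
            matrix = substitute(matrix, var, skolem)
    return universals, matrix

# (∀x)(∃y)(P(x,y)) becomes (∀x)(P(x, sk1(x)))
print(skolemize([("forall", "x"), ("exists", "y")], ("P", "x", "y")))
```

Here sk1 plays the role of the function f in the slide's example.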

Unification (1)

To resolve clauses we often need to make substitutions. For example:

{(P(w,x)), (¬P(y,z))}

To resolve, we need to substitute y for w, and z for x, giving:

{(P(y,z)), (¬P(y,z))}

Now these resolve to give falsum.

Unification (2)

A substitution that enables us to resolve a set of clauses is called a unifier.

We write the unifier we used above as:

{y/w, z/x}

A unifier (u) is a most general unifier (mgu) if any other unifier can be formed by the composition of u with some other substitution.

An algorithm can be written that obtains a most general unifier for any set of expressions that can be unified, and reports failure otherwise.
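One common way to compute a most general unifier is a recursive comparison of two expressions with an occurs check. The Python sketch below is an illustration, not the specific algorithm the slides refer to. It assumes variables are strings beginning with "?" and compound expressions are tuples whose first element is the predicate or function name. The helper names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """True if `var` appears inside `t` (prevents bindings like ?x = f(?x))."""
    t = walk(t, subst)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a, subst) for a in t)

def unify(a, b, subst=None):
    """Return a most general unifier as a dict, or None if none exists."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# P(w, x) and P(y, z) unify with {y/w, z/x}
print(unify(("P", "?w", "?x"), ("P", "?y", "?z")))
```

The returned dictionary maps each replaced variable to its replacement, so `{"?w": "?y", "?x": "?z"}` corresponds to {y/w, z/x}. Bindings may chain, so `walk` should be used to look up the final value of a variable.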

The Resolution Algorithm

1 First, negate the conclusion and add it to the list of assumptions.

2 Now convert the assumptions into Prenex Normal Form.

3 Next, skolemize the resulting expression.

4 Now convert the expression into a set of clauses.

5 Now resolve the clauses using suitable unifiers.

This algorithm means we can write programs that automatically prove theorems using resolution.

Horn Clauses in PROLOG (1)

PROLOG uses resolution.

A Horn clause has at most one positive literal:

A v ¬B v ¬C v ¬D v ¬E …

This can also be written as an implication:

B Λ C Λ D Λ E → A

In PROLOG, this is written:

A :- B, C, D, E

Horn Clauses in PROLOG (2)

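Horn clauses can express rules:

A :- B, C, D

Or facts:

A :-

Or goals:

:- B, C, D, E

If a goal follows from a set of Horn clauses, a resolution proof of it exists. PROLOG searches for that proof using resolution and depth-first search, although depth-first search can fail to terminate on some programs (for example, left-recursive rules).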
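The depth-first strategy can be illustrated for propositional Horn clauses, where no unification is needed. The following Python sketch is not PROLOG itself. The rule encoding (a head paired with a list of body atoms), the function name `prove`, and the depth limit are all assumptions added for illustration. The depth limit is there only so that a looping rule returns instead of recursing forever.

```python
def prove(goals, rules, depth=25):
    """Depth-first, left-to-right search for a proof of every goal in `goals`."""
    if not goals:
        return True              # empty goal list: nothing left to prove
    if depth == 0:
        return False             # abandon this branch (depth limit is an assumption)
    first, rest = goals[0], goals[1:]
    for head, body in rules:     # try rules in program order
        # Replace the first goal by the body of a matching rule.
        if head == first and prove(body + rest, rules, depth - 1):
            return True
    return False

rules = [
    ("a", ["b", "c"]),   # a :- b, c
    ("b", []),           # b :-
    ("c", ["b"]),        # c :- b
]

print(prove(["a"], rules))   # goal  :- a
```

With the single rule `p :- p`, the search never reaches a fact. The depth limit makes `prove(["p"], [("p", ["p"])])` return `False`, whereas an unbounded depth-first search would not terminate.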

Skolemization Example

(∀X)([a(X) Λ b(X)] → [c(X,I) Λ (∃X)((∀Z)[c(X,Z)] → d(X,X))]) v (∀X)(e(X))

(∀X)([a(X) Λ b(X)] → [c(X,I) Λ (∃Y)((∀Z)[c(Y,Z)] → d(X,Y))]) v (∀X)(e(X))

(∀X)(¬[a(X) Λ b(X)] v [c(X,I) Λ (∃Y)((∀Z)[c(Y,Z)] → d(X,Y))]) v (∀X)(e(X))

(∀X)(¬[a(X) Λ b(X)] v [c(X,I) Λ (∃Y)((∃Z)[¬c(Y,Z)] v d(X,Y))]) v (∀X)(e(X))

(∀X)([¬a(X) v ¬b(X)] v [c(X,I) Λ (∃Y)((∃Z)[¬c(Y,Z)] v d(X,Y))]) v (∀X)(e(X))

(∀X)([¬a(X) v ¬b(X)] v [c(X,I) Λ (∃Y)((∃Z)[¬c(Y,Z)] v d(X,Y))]) v (∀W)(e(W))

(∀X)(∃Y)(∃Z)(∀W)([¬a(X) v ¬b(X)] v [c(X,I) Λ (¬c(Y,Z) v d(X,Y))] v e(W))

(∀X)(∀W)([¬a(X) v ¬b(X)] v [c(X,I) Λ (¬c(f(X),g(X)) v d(X,f(X)))] v e(W))

[¬a(X) v ¬b(X)] v [c(X,I) Λ (¬c(f(X),g(X)) v d(X,f(X)))] v e(W)

[¬a(X) v ¬b(X) v c(X,I) v e(W)] Λ [¬a(X) v ¬b(X) v ¬c(f(X),g(X)) v d(X,f(X)) v e(W)]

[¬a(X) v ¬b(X) v c(X,I) v e(W)] Λ [¬a(U) v ¬b(U) v ¬c(f(U),g(U)) v d(U,f(U)) v e(V)]