On a Semantic Subsumption Test
Jerzy Marcinkowski
Jan Otop
Grzegorz Stelmaszek
Institute of Computer Science
University of Wrocław,
Wrocław, Poland
(LPAR 04)
Outline of the talk
• Preliminaries 5 min.
– subsumption
– architecture of a theorem prover
• Our work – subsumption 25 min.
– first idea
– a semantics of subsumption
– implementation
• Our work – matching 5 min.
Subsumption - quick reminder
We say that clause C subsumes D if an instance of C is a fragment of D.
C = {E(r, s, f(z)), E(s, v, z)}
D = {E(f(x), y, f(c)), E(y, f(k), fff(c)), E(f(k), u, f(c)), E(u, f(x), ff(c))}
The first literal of C can be matched against a literal of D in more than one way:
• E(r, s, f(z)) against E(u, f(x), ff(c)): the second literal of C becomes E(f(x), v, f(c)), which matches E(f(x), y, f(c)).
• E(r, s, f(z)) against E(y, f(k), fff(c)): the second literal of C becomes E(f(k), v, ff(c)), which matches no literal of D.
As you see, subsumption is about choice. No wonder it is NP-complete.
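The choice in the example can be made explicit with a small backtracking search. This is only a sketch under assumptions that are not in the talk: terms are nested tuples, literals are unsigned atoms as in the example, and the names `match` and `subsumes` are made up for illustration. It is not how Otter implements the test.

```python
# Terms: a string is a variable or a constant; a tuple (name, arg1, ...) is a
# function application or, at top level, an atom such as ("E", t1, t2, t3).

def match(pattern, target, subst, pattern_vars):
    """Extend subst so that pattern under subst equals target, else None."""
    if isinstance(pattern, str):
        if pattern in pattern_vars:
            if pattern in subst:
                return subst if subst[pattern] == target else None
            return {**subst, pattern: target}
        return subst if pattern == target else None
    if (isinstance(target, str) or pattern[0] != target[0]
            or len(pattern) != len(target)):
        return None
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        subst = match(p_arg, t_arg, subst, pattern_vars)
        if subst is None:
            return None
    return subst


def subsumes(c, d, c_vars, subst=None):
    """Return a substitution sending every literal of c into d, else None."""
    subst = {} if subst is None else subst
    if not c:
        return subst
    first, rest = c[0], c[1:]
    for literal in d:                      # the choice point
        extended = match(first, literal, subst, c_vars)
        if extended is not None:
            result = subsumes(rest, d, c_vars, extended)
            if result is not None:
                return result              # this choice worked
    return None                            # every choice failed: backtrack


def f(t):
    return ("f", t)


C = [("E", "r", "s", f("z")), ("E", "s", "v", "z")]
D = [("E", f("x"), "y", f("c")),
     ("E", "y", f("k"), f(f(f("c")))),
     ("E", f("k"), "u", f("c")),
     ("E", "u", f("x"), f(f("c")))]

# Variables of D are never instantiated; only those of C are.
sigma = subsumes(C, D, c_vars={"r", "s", "v", "z"})
print(sigma)
```

On this input the search rejects the first three candidates for E(r, s, f(z)) and succeeds on the fourth, with r ↦ u, s ↦ f(x), z ↦ f(c), v ↦ y. The loop over `d` inside the recursion is where the exponential worst case comes from.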
Architecture of a theorem prover
set A of clauses
↓
A new clause C is born from parents in A
↓
Subsumption test: does there exist D ∈ A such that D subsumes C?
• Yes (90%): C is redundant, its children would be redundant, so better if it dies now.
• No (10%): C is added to A.
This goes on until the empty clause is born...
• Many thousands of subsumption tests need to be performed in a run
of a theorem prover.
• Most of them prove negative.
• They take about half of the running time of a prover.
• Complicated indexing techniques have been developed (discrimination
trees, code trees). What they index is the syntax of clauses. We want
to index clauses with respect to their meaning.
Architecture of a theorem prover
An index is placed in front of the subsumption test. For a new clause C the index answers either "no" or "maybe":
• no: no clause in A subsumes C, so the subsumption test is skipped,
• maybe: the subsumption test is run and answers "no" or "yes".
A semantic (no)-subsumption test
• take 64 small random structures M_1, M_2, ..., M_64 over the signature of interest
• for each clause D ∈ A compute profile(D) = ⟨i_1, ..., i_64⟩, where i_k is the truth value of D in M_k
• index clauses with respect to their profiles
Observation: Let profile(D) = ⟨i_1, ..., i_64⟩ and profile(C) = ⟨j_1, ..., j_64⟩. If ∀k i_k ≤ j_k then D maybe subsumes C. Otherwise it does not.
It is a very cheap test! Just check if ⟨i_1 ∧ j_1, ..., i_64 ∧ j_64⟩ ≠ ⟨i_1, ..., i_64⟩.
But is it any good? Can we expect decent selectivity?
A semantic (no)-implication test
• take 64 small random structures M_1, M_2, ..., M_64 over the signature of interest
• for each clause D ∈ A compute profile(D) = ⟨i_1, ..., i_64⟩, where i_k is the truth value of D in M_k
• index clauses with respect to their profiles
Observation: Let profile(D) = ⟨i_1, ..., i_64⟩ and profile(C) = ⟨j_1, ..., j_64⟩. If ∀k i_k ≤ j_k then D maybe implies C. Otherwise it does not.
It is a very cheap test! Just check if ⟨i_1 ∧ j_1, ..., i_64 ∧ j_64⟩ ≠ ⟨i_1, ..., i_64⟩.
But is it any good? Can we expect decent selectivity?
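With two truth values a profile fits into one machine word, and the check is a single AND plus a comparison. A minimal sketch, assuming profiles are stored as integers with bit k holding the truth value in the k-th model; the names `profile` and `may_imply` are illustrative and do not come from the implementation:

```python
def profile(truth_values):
    """Pack 0/1 truth values (one per model) into an integer, bit k = model k."""
    bits = 0
    for k, value in enumerate(truth_values):
        if value:
            bits |= 1 << k
    return bits


def may_imply(profile_d, profile_c):
    """False means: some model makes D true and C false, so D does not imply C."""
    return (profile_d & profile_c) == profile_d


# Four models instead of 64, to keep the example readable.
d = profile([1, 0, 1, 0])        # D is true in models 0 and 2
c_ok = profile([1, 1, 1, 0])     # C is true wherever D is: "maybe"
c_bad = profile([0, 1, 0, 0])    # C is false in model 0 where D is true: "no"

print(may_imply(d, c_ok), may_imply(d, c_bad))
```

A "no" answer is definitive, while a "maybe" still has to be confirmed by the real subsumption test.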
A semantic (no)-implication test
[Figure: the instances of subsumption form a part of the instances of implication; the pair ⟨C_{1 step}, C_{2 steps}⟩ lies in the blue fragment, among the instances of implication that are not instances of subsumption.]
C_{1 step} = {¬P(x), P(f(x))}    C_{2 steps} = {¬P(x), P(f(f(x)))}
C_{1 step} implies C_{2 steps} but doesn't subsume it.
Is this blue fragment a practical problem? Can we do anything about it?
We want to discover
some semantic meaning of subsumption.
Let us begin with:
Definition (of the truth value of clause C in structure M)....
Yes, I realize that you know what a truth value of a formula is.
Let L be a logic... i.e. a total order with a unary function L → L called negation.
An L-model is a set M with:
a function M^k → M for each k-ary function symbol,
a function M^k → L for each k-ary relation symbol.
Take a valuation of variables: v : Var → M
It extends to: v : terms → M
then to: v : atomic formulas → L
and then to: v : literals → L
Truth value TV(C, M, v) of clause C in M under valuation v is...
max{v(L) | L ∈ C}
...the maximal truth value of a literal from C.
This max is what you used to call disjunction.
Truth value TV(C, M) of clause C in M is...
min{TV(C, M, v) | v : Var → M}
...the minimal truth value of C in M under all possible valuations
(sometimes called universal quantification).
C implies D in logic L if TV(C, M) ≤ TV(D, M) for each L-model M.
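For a finite model the two definitions translate directly into a min over valuations of a max over literals. The following is a sketch only. The term and literal encoding, the function names and the example model are all assumptions made for illustration; the logic used is the classical one, 0 < 1 with negation(x) = 1 - x.

```python
from itertools import product


def eval_term(term, funcs, valuation):
    """Strings are variables; a tuple (name, arg1, ...) applies funcs[name]."""
    if isinstance(term, str):
        return valuation[term]
    name, *args = term
    return funcs[name](*(eval_term(a, funcs, valuation) for a in args))


def literal_value(literal, funcs, rels, negation, valuation):
    """A literal is (positive, relation name, tuple of argument terms)."""
    positive, rel, args = literal
    value = rels[rel](*(eval_term(a, funcs, valuation) for a in args))
    return value if positive else negation(value)


def truth_value(clause, variables, domain, funcs, rels, negation):
    """TV(C, M): min over all valuations of the max over the literals of C."""
    return min(
        max(literal_value(lit, funcs, rels, negation, dict(zip(variables, values)))
            for lit in clause)
        for values in product(domain, repeat=len(variables))
    )


# Hypothetical model: M = {0, 1}, f swaps the two elements, P holds only for 0.
domain = (0, 1)
funcs = {"f": lambda a: 1 - a}
rels = {"P": lambda a: 1 if a == 0 else 0}
negation = lambda x: 1 - x

c_1step = [(False, "P", ("x",)), (True, "P", (("f", "x"),))]
c_2steps = [(False, "P", ("x",)), (True, "P", (("f", ("f", "x")),))]

tv1 = truth_value(c_1step, ["x"], domain, funcs, rels, negation)
tv2 = truth_value(c_2steps, ["x"], domain, funcs, rels, negation)
print(tv1, tv2)
```

In this model the one-step clause gets value 0 and the two-step clause gets value 1, so the model is consistent with the first clause implying the second. Swapping in another total order and another `negation` gives the truth value in a different logic without touching the evaluator.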
Remark.
What you used to call implication is implication in logic 0 < 1, where
negation(x)=1-x
Observation.
If C subsumes D then C implies D in every logic.
Proof: The minimum ranges over more valuations in C than in D. The
maximum ranges over more literals in D than in C.
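The proof can be spelled out as a chain of inequalities. The notation is introduced here for illustration: σ is a substitution with Cσ ⊆ D, and v ∘ σ denotes the valuation x ↦ v(xσ), so that (v ∘ σ)(L) = v(Lσ). Only the order on L is used, which is why the argument works in every logic.

```latex
\begin{align*}
TV(C, M) &= \min_{v} \max_{L \in C} v(L) \\
         &\le \min_{v} \max_{L \in C} (v \circ \sigma)(L)
            && \text{each } v \circ \sigma \text{ is itself a valuation} \\
         &= \min_{v} \max_{L' \in C\sigma} v(L') \\
         &\le \min_{v} \max_{L' \in D} v(L')
            && \text{since } C\sigma \subseteq D \\
         &= TV(D, M).
\end{align*}
```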
So whatever logic we take, our subsumption test will remain sound. Is
there any chance to make it complete (at least in principle)?
As some of you may still remember....
We want to discover
some semantic meaning of subsumption.
Consider the following Strange 4 Valued Logic (S4VL):
XT > T > F > XF
negation(XT) = XT
negation(T) = F
negation(F) = T
negation(XF) = XF
Theorem. Subsumption of clauses is (finite) implication in S4VL.
Remark. In the practical cases subsumption is already implication in S3VL: the value XF is not needed.
Back to C_{1 step} and C_{2 steps}
Among the 4-element S3VL-models, 10.7% are structures M which witness no-subsumption, i.e.:
TV(C_{1 step}, M) > TV(C_{2 steps}, M)
The probability that, for a random sequence M_1, M_2, ..., M_64 of four-element models, TV(C_{1 step}, M_i) ≤ TV(C_{2 steps}, M_i) holds for each i is:
0.893^64 < 0.001
(this is the probability of a false positive in our test)
Among the 36 two-element S3VL-models there are four such good structures M, which is 11.1%.
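The two-element count is small enough to check by brute force. The sketch below assumes the signature consists of just the unary function f and the unary relation P occurring in the two clauses, and encodes the truth values as integers. The encoding and the helper name `tv_step` are illustrative only.

```python
from itertools import product

F, T, XT = 0, 1, 2                  # S3VL order: XT > T > F
NEG = {F: T, T: F, XT: XT}          # the S4VL negation restricted to S3VL


def tv_step(f, p, steps):
    """Truth value of {¬P(x), P(f^steps(x))} in the model (f, p) over {0, 1}."""
    def power(x):
        for _ in range(steps):
            x = f[x]
        return x
    return min(max(NEG[p[x]], p[power(x)]) for x in (0, 1))


models = [(f, p)
          for f in product((0, 1), repeat=2)         # f[x] is the value of f at x
          for p in product((F, T, XT), repeat=2)]    # p[x] is the truth value of P(x)

# Models in which the one-step clause is strictly truer than the two-step clause.
witnesses = [m for m in models if tv_step(*m, 1) > tv_step(*m, 2)]

print(len(models), len(witnesses))   # 36 4
print(0.893 ** 64 < 0.001)           # True
```

All four witnesses use the f that swaps the two elements and give XT to exactly one of P(0), P(1). In classical two-valued logic no model can separate the clauses, because the implication really holds there.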
Implementation
We implemented our ideas in Otter and compared the performance of:
• Otter
• our 2/2 semantic Otter (64 models consisting of 2 elements each, 2
truth values, negation as identity)
• our 2/3 semantic Otter (32 models of 2 elements each, S3VL)
• our 4/4 semantic Otter (20 models of 4 elements each, 4 truth values,
negation as identity)
Implementation issues: – profile indexing, – computing of the truth values.
Benchmarks
Reference Set: those problems from TPTP for which
either 2/2 semantic Otter or Otter finds a proof in less than 300 seconds, and
at least one of them needs more than 5 seconds to find a proof.
Otter vs. 2/2 semantic Otter
(by TPTP domain, not all domains included, run on the Reference Set)
Entries are numbers of theorems.
TPTP    Not proved by   Otter was       They perform    Otter was       Not proved
domain  2/2 sem. Otter  > 30% faster    equally         > 30% slower    by Otter
BOO     3               16              0               0               1
GRP     24              65              4               7               5
LCL     45              52              0               0               0
SYN     3               14              0               1               0
CAT     0               0               0               6               1
GEO     0               0               0               8               19
HWV     0               0               0               9               1
MGT     0               0               0               3               1
Theorems not proved by different versions of Otter
(by the maximal number of literals
in the input clauses, run on the Reference Set).
Max. literals in    Not proved by   Not proved by   Not proved by   Not proved
input clauses       2/2 sem. Otter  2/3 sem. Otter  4/4 sem. Otter  by Otter
4 or less           77              101             105             5
5                   3               8               8               1
6                   0               1               0               1
7                   0               1               0               1
8                   0               6               6               11
9                   1               1               1               1
10 to 19            0               7               2               6
FOF                 0               2               2               5
Related idea: matching and unification
For a structure M and a term t ∈ T_V(Σ) define the set of possible values
of t in M as
Val(t, M) = {τ̄(t) : τ : V → M}
Observation. Suppose t is an instance of s, and M is any structure. Then
Val(t, M) ⊆ Val(s, M).
In other words, if we can guess a structure M in which Val(t, M) ⊆ Val(s, M) does not hold, then t is not an instance of s.
But why should we bother? Matching and unification are easy anyway!
Why would we need a semantic test?
The above observation is true also for the AC case. AC matching, like
subsumption, is NP-complete and it turns out that it is AC matching that
takes most of the running time of EQP, a cousin of Otter built to prove
theorems in first-order equational logic.
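For a finite structure the set Val(t, M) can be computed by running through all valuations. A minimal sketch, assuming terms are nested tuples and using a made-up four-element structure with one binary function g. Neither the encoding nor the structure comes from the EQP implementation, and the example is plain matching rather than AC matching.

```python
from itertools import product


def eval_term(term, funcs, valuation):
    """Strings are variables; a tuple (name, arg1, ...) applies funcs[name]."""
    if isinstance(term, str):
        return valuation[term]
    name, *args = term
    return funcs[name](*(eval_term(a, funcs, valuation) for a in args))


def val(term, variables, domain, funcs):
    """Val(term, M): the set of values the term takes over all valuations."""
    return {
        eval_term(term, funcs, dict(zip(variables, values)))
        for values in product(domain, repeat=len(variables))
    }


# Hypothetical structure: M = {0, 1, 2, 3} with g(a, b) = a * b mod 4.
domain = range(4)
funcs = {"g": lambda a, b: (a * b) % 4}
variables = ["x", "y"]

s = ("g", "x", "y")     # g(x, y)
t = ("g", "x", "x")     # g(x, x), an instance of s

val_s = val(s, variables, domain, funcs)
val_t = val(t, variables, domain, funcs)
print(val_t <= val_s)   # t is an instance of s, so the inclusion must hold
print(val_s <= val_t)   # fails here, so s is not an instance of t
```

In a prover the sets would be computed once per term and stored, so that the inclusion check replaces a matching attempt whenever it fails.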
We implemented the above idea in EQP. Terms are profiled by 32 random
models, each of them of size 4. We ran our semantic EQP on Robbins
Conjecture.
EQP vs. semantic EQP
on the lemmas of Robbins Conjecture
                             EQP             semantic EQP
Lemma 1, total time          72.67 sec       36.74 sec
Lemma 1, AC matching time    56.13 sec       20.49 sec
Lemma 2, total time          25477.20 sec    11405.72 sec
Lemma 2, AC matching time    21030.06 sec    6812.41 sec