Introduction
Methodology
Results
Wide-Coverage CCG Parsing with Quantifier Scope
Dimitrios Kartsaklis
MSc Thesis, University of Edinburgh
Supervisor: Professor Mark Steedman
July 18, 2011
Dimitrios Kartsaklis Wide-Coverage CCG Parsing with Quantifier Scope 1/ 28
Introduction
- A Natural Language Processing project
- Dealing with semantics, and specifically with quantifier scope ambiguities
- Purpose: the creation of a wide-coverage semantic parser capable of handling quantifier scope ambiguities using Generalized Skolem Terms
- Grammar formalism: Combinatory Categorial Grammar (CCG)
- Logical form: first-order logic, using λ-calculus as “glue” language
Quantification
- All known human languages make use of quantification. In English:
  • Universal quantifiers (∀): every, each, all, ...
  • Existential quantifiers (∃): a, some, ...
  • Generalized quantifiers: most, at least, few, ...
- Traditional representations using first-order logic and λ-calculus:
  • Universal: λp.λq.∀x [p(x) → q(x)]
  • Existential: λp.λq.∃x [p(x) ∧ q(x)]
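As a minimal sketch (not the thesis implementation), the two determiner meanings above can be read as higher-order functions over a small finite model. The domain and the predicates boy, walks and sings are hypothetical and serve only as illustration.

```python
# Hypothetical finite model; the individuals and predicates are illustrative only.
DOMAIN = {"tom", "bob", "ann"}

def every(p):
    # λp.λq.∀x[p(x) → q(x)]
    return lambda q: all((not p(x)) or q(x) for x in DOMAIN)

def some(p):
    # λp.λq.∃x[p(x) ∧ q(x)]
    return lambda q: any(p(x) and q(x) for x in DOMAIN)

boy = lambda x: x in {"tom", "bob"}
walks = lambda x: x in DOMAIN          # everybody walks in this model
sings = lambda x: x == "ann"           # only ann sings

every_boy_walks = every(boy)(walks)
some_boy_sings = some(boy)(sings)
```

Applying a determiner to a noun meaning and then to a verb-phrase meaning mirrors the two λ-abstractions p and q in the formulas.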
Compositionality
- Frege’s principle: “The meaning of the whole is a function of the meaning of its parts”
- Example: “Every boy likes some girl”
  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.∃x[p(x) ∧ q(x)]
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.∃x[girl(x) ∧ q(x)]
  (>) likes some girl :- S\NP : λy.∃x[girl(x) ∧ likes(y, x)]
  (<) Every boy likes some girl :- S : ∀y[boy(y) → ∃x[girl(x) ∧ likes(y, x)]]
Quantifier scope ambiguities
- Example: “Every boy likes some girl”
  • ∀x [boy(x) → ∃y [girl(y) ∧ likes(x, y)]]
    (every boy likes a possibly different girl)
- However, this is not the only meaning:
  • ∃y [girl(y) ∧ ∀x [boy(x) → likes(x, y)]]
    (there is a specific girl who is liked by every boy)
- But our semantics is surface-compositional, so syntax allows only the first reading
- We need a quantification method that delivers both readings from a single syntactic derivation
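That the two readings are genuinely different can be checked in a toy model where they come apart. The sketch below is illustrative only: the individuals and the LIKES relation are invented, and the two functions simply transcribe the two formulas above.

```python
# Hypothetical model: each boy likes a different girl.
BOYS = {"tom", "bob"}
GIRLS = {"ann", "eve"}
LIKES = {("tom", "ann"), ("bob", "eve")}

def surface_reading():
    # ∀x[boy(x) → ∃y[girl(y) ∧ likes(x, y)]]
    return all(any((x, y) in LIKES for y in GIRLS) for x in BOYS)

def inverse_reading():
    # ∃y[girl(y) ∧ ∀x[boy(x) → likes(x, y)]]
    return any(all((x, y) in LIKES for x in BOYS) for y in GIRLS)
```

In this model the first reading holds and the second does not, so a parser that delivers only one of them loses information.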
Underspecification
- A solution to the problem: provide underspecified representations of the quantified expressions without explicitly specifying their scope:
  • 〈loves(x1, x2), (λq.∀x [boy(x) → q(x)], 1), (λq.∃y [girl(y) ∧ q(y)], 2)〉
- Specification is performed in a separate step, after the end of the syntactic derivation, by combining the available quantified expressions in every possible way
- The best-known underspecification technique is Cooper storage
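The retrieval step can be sketched as follows. This is a deliberately simplified stand-in for Cooper storage (flat store, no nested stores, formulas as plain strings); the function name retrieve and the string encoding are assumptions made for illustration.

```python
from itertools import permutations

def retrieve(core, store):
    """Build one reading per ordering of the stored quantifiers."""
    readings = []
    for order in permutations(store):
        formula = core
        # A quantifier retrieved later wraps the formula built so far,
        # so it takes wider scope.
        for wrap in order:
            formula = wrap(formula)
        readings.append(formula)
    return readings

# Stored quantifiers, written as functions from a body to a closed formula.
every_boy = lambda body: f"all x1.(boy(x1) -> {body})"
some_girl = lambda body: f"exists x2.(girl(x2) & {body})"

readings = retrieve("loves(x1, x2)", [every_boy, some_girl])
```

With n stored quantifiers this enumerates n! orderings, regardless of whether the resulting formulas are equivalent or attested.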
Underspecification problems
- Decoupling the semantic derivation from the syntactic combinatorics can lead to problems:
  • Possibly equivalent logical forms: “Some boy likes some girl”
    a. ∃x [boy(x) ∧ ∃y [girl(y) ∧ likes(x, y)]]
    b. ∃y [girl(y) ∧ ∃x [boy(x) ∧ likes(x, y)]]
  • Scope asymmetries: “Every boy likes, and every girl detests, some saxophonist”:
    ∀x [boy(x) → ∃y [sax(y) ∧ likes(x, y)]] ∧ ∃v [sax(v) ∧ ∀z [girl(z) → detests(z, v)]]
  • Intermediate readings: “Some teacher showed every pupil every movie”:
    ∀x [movie(x) → ∃y [teacher(y) ∧ ∀z [pupil(z) → showed(x, y, z)]]]
Skolemization
- If existentials cause such problems, why not remove them altogether?
- Skolemization: the process of replacing an existentially quantified variable with a function of all universally quantified variables in whose scope the existential falls.
- Example 1: ∀x∃y∀z.P(x, y, z) ⟹ ∀x∀z.P(x, sk(x), z)
  • The existential y is replaced by sk(x), since x is the only preceding universal.
- Example 2: ∃y∀x∀z.P(x, y, z) ⟹ ∀x∀z.P(x, sk(), z)
  • Now sk() is a function without arguments, i.e. a constant.
- Example 3: “Every boy likes some girl”:
  • Normal form: ∀x [boy(x) → ∃y [girl(y) ∧ likes(x, y)]]
  • Skolemized form: ∀x [boy(x) → likes(x, sk^{x}_girl)]
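For formulas in prenex form, Skolemization reduces to a single left-to-right pass over the quantifier prefix. The sketch below assumes that restricted setting; the function name skolemize and the sk_y naming scheme are hypothetical.

```python
def skolemize(prefix, matrix_args):
    """Skolemize a prenex formula.

    prefix:      list of ("all" | "exists", variable) pairs, outermost first
    matrix_args: argument names of the predicate in the matrix
    Returns the remaining universal variables and the rewritten arguments.
    """
    universals = []
    replacement = {}
    for kind, var in prefix:
        if kind == "all":
            universals.append(var)
        else:
            # The Skolem function depends on the universals seen so far.
            replacement[var] = f"sk_{var}({', '.join(universals)})"
    return universals, [replacement.get(a, a) for a in matrix_args]

# Example 1: ∀x∃y∀z.P(x, y, z)
example1 = skolemize([("all", "x"), ("exists", "y"), ("all", "z")], ["x", "y", "z"])
# Example 2: ∃y∀x∀z.P(x, y, z)
example2 = skolemize([("exists", "y"), ("all", "x"), ("all", "z")], ["x", "y", "z"])
```

In Example 2 no universal precedes the existential, so the Skolem function has no arguments and behaves as a constant.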
One step further: Generalized Skolem Terms
- The only true quantifiers in English are the universals every, each, and their relatives
- Every non-universal quantifier should be associated with an (as yet unspecified) Skolem term
- Skolem terms can be specified according to their environment at any step of the derivation process, yielding the generalized form
  sk^E_{n:p;c}
  where E is the environment (preceding universals) and
  • n is the number of the originating noun phrase
  • p is a nominal property (e.g. “girl”)
  • c is a cardinality condition (e.g. λs.|s| > 1)
(Mark Steedman, Natural Semantics of Scope, currently in publication by MIT Press)
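One possible way to hold the four components of a generalized Skolem term in code is a small immutable record. This is a sketch under assumptions, not the data structure used in the thesis; the class and field names are invented, and the string form is just one rendering of sk^E_{n:p}.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class GeneralizedSkolemTerm:
    environment: FrozenSet[str]               # E: universal variables in scope
    np_index: int                             # n: originating noun phrase
    prop: str                                 # p: nominal property, e.g. "girl"
    cardinality: Callable[[frozenset], bool]  # c: condition on the denoted set

    def __str__(self) -> str:
        env = ",".join(sorted(self.environment))
        return f"sk^{{{env}}}_{{{self.np_index}:{self.prop}}}"

more_than_one = lambda s: len(s) > 1          # the example condition λs.|s| > 1
term = GeneralizedSkolemTerm(frozenset({"x"}), 2, "girl", more_than_one)
```

Keeping the environment as a set makes two specifications comparable by content, which is what the specification step later relies on.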
Two available readings
- Specification takes place at the beginning of the derivation
  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.q(skolem(p))
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.q(skolem(λx.girl(x)))
  (spec) some girl :- NP : λq.q(sk^{}_girl)
  (>) likes some girl :- S\NP : λy.likes(y, sk^{}_girl)
  (<) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, sk^{}_girl)]
- Specification takes place at the end of the derivation
  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.q(skolem(p))
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.q(skolem(λx.girl(x)))
  (>) likes some girl :- S\NP : λy.likes(y, skolem(λx.girl(x)))
  (<) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, skolem(λx.girl(x)))]
  (spec) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, sk^{y}_girl)]
Advantages
- Provides a global solution to the most important quantification problems
- Skolem terms are part of the semantic theory, not ad hoc mechanisms
- Easy integration with CCG parsers, with no significant increase in computational complexity
- Semantic derivation is performed “on-line”, based on the combinatory rules of CCG
  • This limits the degrees of freedom with which the available readings are derived, so unattested or redundant readings are excluded
A proof of concept
- Purpose of the project: the creation of a wide-coverage semantic parser that applies the previously described quantification approach.
- Main tasks:
  1. Create a wide-coverage probabilistic syntactic parser
  2. Create a λ-calculus framework for the logical forms
  3. Integrate the semantic combinatorics into the parser
  4. Provide appropriate logical forms for the CCG lexicon
- Eventually: provide a proof of concept for the theory by testing the results on specific quantification cases
Syntactic parsing
- The parser is based on the OpenCCG framework
  • Well-tested API for parsing, supports every aspect of CCG
- Two additions:
  • A supertagger for assigning initial categories to the words (Clark & Curran)
  • A probabilistic model incorporating head-word dependencies (Hockenmaier)
- Standard interpolation techniques for dealing with sparse-data problems
- Beam search for pruning the search space
Probabilistic model
- A generative model at the level of local trees (trees of depth 1) (Hockenmaier)
- Baseline version: the probability of a local tree with root N and children H and S is the product of:
  • An expansion probability P(expansion | N)
  • A head probability P(H | N, expansion)
  • A non-head probability P(S | N, expansion, H)
  • A lexical probability P(w | N, expansion = leaf)
- The overall probability of a derivation is the product of the probabilities of all local trees
- Head-word dependencies version: also takes into account the relationships between the heads of local trees
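The baseline scoring can be sketched as below. The numbers are invented placeholders, not estimates from CCGbank, and treating a factor that does not apply to a given local tree as 1.0 is an assumption of this sketch. Log probabilities are summed to avoid numerical underflow on long derivations.

```python
import math

def local_tree_prob(p_exp, p_head=1.0, p_nonhead=1.0, p_word=1.0):
    """Product of the factors of one local tree (inapplicable factors stay 1.0)."""
    return p_exp * p_head * p_nonhead * p_word

def derivation_log_prob(local_trees):
    """Log probability of a derivation: sum over all of its local trees."""
    return sum(math.log(local_tree_prob(**tree)) for tree in local_trees)

# Hypothetical derivation with one binary node and one leaf.
trees = [
    {"p_exp": 0.5, "p_head": 0.5, "p_nonhead": 0.5},  # binary expansion
    {"p_exp": 0.25, "p_word": 0.5},                   # leaf expansion
]
log_p = derivation_log_prob(trees)
```

The head-word dependencies version would condition these factors on additional lexical information, which this sketch does not model.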
Semantic forms
- An object-oriented approach
- Formulas are represented as a set of nested objects
  • Example: “Every man walks”
    Infix notation: ∀x [man(x) → walks(x)]
    Prefix notation: all(x,imp(man(x),walks(x)))
    Nested objects:
      all(x, expr)
        imp(expr1, expr2)
          man(x)
          walks(x)
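The nesting can be illustrated with a handful of small classes whose string form is the prefix notation. The class names here are simplified stand-ins chosen for this sketch, not the class names of the thesis code.

```python
from dataclasses import dataclass
from typing import Tuple

class Expression:
    """Base class of all logical-form objects (hypothetical)."""

@dataclass(frozen=True)
class Variable(Expression):
    name: str
    def __str__(self):
        return self.name

@dataclass(frozen=True)
class Predicate(Expression):
    name: str
    args: Tuple[Expression, ...]
    def __str__(self):
        return f"{self.name}({','.join(map(str, self.args))})"

@dataclass(frozen=True)
class Implication(Expression):
    antecedent: Expression
    consequent: Expression
    def __str__(self):
        return f"imp({self.antecedent},{self.consequent})"

@dataclass(frozen=True)
class Universal(Expression):
    variable: Variable
    body: Expression
    def __str__(self):
        return f"all({self.variable},{self.body})"

x = Variable("x")
formula = Universal(x, Implication(Predicate("man", (x,)), Predicate("walks", (x,))))
```

Because every sub-formula is an object held by its parent, the universals that enclose any given term can be found by walking down from the root.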
Object hierarchy
[Class diagram rooted at Expression. Classes shown, level by level: LambdaAbstraction, FunctionalApplication, FirstOrderExpression, Variable, SkolemTerm; Quantification, Conjunction, ...; BindingTerm, PlainVariable; Lambda, Quantified; GeneralizedST; GenericPredicate; Constant]
Skolem Term representations
- A single linked packed structure:
  [Diagram: a SkolemTerm object holds object references (pointers) to its specifications: GeneralizedSkolemTerm(Specification1), GeneralizedSkolemTerm(Specification2), GeneralizedSkolemTerm(Specification3)]
- Example: “A boy ate a pizza”
  ate({skolem′, sk′}boy′, {skolem′, sk′}pizza′)
β-conversion
- β-conversion: the process of substituting the argument passed to a λ-abstraction for the bound variable in its body
- A stack-based method (Blackburn & Bos)
  1. If the expression is an application, push its argument onto the stack and discard the outermost application object.
  2. If the expression is a λ-abstraction, throw away the λ-term, pop the item at the top of the stack, and substitute it for every occurrence of the corresponding variable.
  3. If the expression is neither an application nor a λ-abstraction, β-convert its sub-expressions.
β-conversion
- Example: “John loves Mary”
  (lex) John :- NP : john
  (lex) loves :- (S\NP)/NP : λx.λy.loves(y, x)
  (lex) Mary :- NP : mary
  (>) loves Mary :- S\NP : λy.loves(y, mary)
  (<) John loves Mary :- S : loves(john, mary)

  Step  Expression                                      Stack
  1     app(app(lam(x,lam(y,loves(y,x))),mary),john)    []
  2     app(lam(x,lam(y,loves(y,x))),mary)              [john]
  3     lam(x,lam(y,loves(y,x)))                        [mary,john]
  4     lam(y,loves(y,mary))                            [john]
  5     loves(john,mary)                                []
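The three rules of the stack-based method can be sketched over a tuple encoding of terms. The encoding ("app", "lam", "pred" tuples, strings for variables and constants) is an assumption of this sketch, and substitution is not capture-avoiding: bound variable names are assumed to be distinct, as they would be after α-conversion.

```python
def substitute(expr, var, value):
    """Replace free occurrences of variable `var` in `expr` by `value`."""
    if isinstance(expr, str):
        return value if expr == var else expr
    if expr[0] == "lam":
        _, bound, body = expr
        return expr if bound == var else ("lam", bound, substitute(body, var, value))
    if expr[0] == "app":
        return ("app", substitute(expr[1], var, value), substitute(expr[2], var, value))
    return ("pred", expr[1], tuple(substitute(a, var, value) for a in expr[2]))

def beta_convert(expr, stack=None):
    stack = [] if stack is None else stack
    if isinstance(expr, tuple) and expr[0] == "app":
        stack.append(expr[2])                 # rule 1: push the argument
        return beta_convert(expr[1], stack)
    if isinstance(expr, tuple) and expr[0] == "lam":
        if stack:                             # rule 2: pop and substitute
            return beta_convert(substitute(expr[2], expr[1], stack.pop()), stack)
        return ("lam", expr[1], beta_convert(expr[2]))
    if isinstance(expr, tuple):               # rule 3: convert sub-expressions
        result = ("pred", expr[1], tuple(beta_convert(a) for a in expr[2]))
    else:
        result = expr
    while stack:                              # irreducible head: rebuild applications
        result = ("app", result, beta_convert(stack.pop()))
    return result

loves = ("lam", "x", ("lam", "y", ("pred", "loves", ("y", "x"))))
term = ("app", ("app", loves, "mary"), "john")
reduced = beta_convert(term)
```

The final loop is an addition of this sketch: it keeps any arguments left on the stack when the head turns out not to be a λ-abstraction.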
OpenCCG and CKY
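- OpenCCG uses the CKY algorithm:
  [CKY chart for “Mary married John”; categories shown in the cells: NP, (S\NP)/NP, S\NP, S/(S\NP), S/NP, S]

  Derivation using application only:
  (lex) Mary :- NP
  (lex) married :- (S\NP)/NP
  (lex) John :- NP
  (>) married John :- S\NP
  (<) Mary married John :- S

  Derivation using type raising and composition:
  (lex) Mary :- NP
  (lex) married :- (S\NP)/NP
  (lex) John :- NP
  (>T) Mary :- S/(S\NP)
  (>B) Mary married :- S/NP
  (>) Mary married John :- S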
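A bare-bones CKY recognizer makes the chart-filling order concrete. This is an independent sketch, not OpenCCG code or its API: categories are encoded as nested tuples, and only forward and backward application are implemented, so the type-raised and composed categories of the second derivation do not arise here.

```python
def combine(left, right):
    """Forward (>) and backward (<) application only."""
    results = set()
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        results.add(left[1])       # X/Y  Y  =>  X
    if isinstance(right, tuple) and right[0] == "\\" and right[2] == left:
        results.add(right[1])      # Y  X\Y  =>  X
    return results

def cky(words, lexicon):
    n = len(words)
    # Cells are indexed by (start, end) spans; width-1 cells come from the lexicon.
    chart = {(i, i + 1): set(lexicon[w]) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = set()
            for mid in range(start + 1, end):   # inner loop over split points
                for l in chart[(start, mid)]:
                    for r in chart[(mid, end)]:
                        cell |= combine(l, r)
            chart[(start, end)] = cell
    return chart

VP = ("\\", "S", "NP")             # S\NP
TV = ("/", VP, "NP")               # (S\NP)/NP
lexicon = {"Mary": ["NP"], "married": [TV], "John": ["NP"]}
chart = cky(["Mary", "married", "John"], lexicon)
```

The innermost loop, where two cells are combined into a result category, is the point at which any per-result processing can be attached.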
CKY modifications
- An additional step, called skolem term specification, is introduced in the inner loop of the CKY algorithm:
  1. For each Skolem term ST in the logical form ΛA, collect the new environment (preceding universals) of ST.
  2. If the new environment differs from the old environment, specify a new Generalized Skolem Term and add it to the specifications list of ST.
  where ΛA is the logical form of a result category A that has been produced by the application of some CCG rule
- The environment is always readily available thanks to the nested internal structure
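The two steps can be sketched as a recursive walk that carries the set of enclosing universals. The tuple encoding of formulas and the SkolemTerm class below are assumptions of this sketch; each recorded environment stands for one generalized Skolem term in the packed structure.

```python
class SkolemTerm:
    """Packed Skolem term: one object, several possible specifications."""
    def __init__(self, prop):
        self.prop = prop
        self.specifications = []   # environments recorded so far

def specify(expr, environment=frozenset()):
    """Record, for every Skolem term in expr, the universals that enclose it."""
    if isinstance(expr, SkolemTerm):
        if environment not in expr.specifications:   # step 2: only if new
            expr.specifications.append(environment)
    elif isinstance(expr, tuple):
        if expr[0] == "all":                         # ("all", variable, body)
            _, var, body = expr
            specify(body, environment | {var})
        else:                                        # (operator, sub-expressions...)
            for sub in expr[1:]:
                specify(sub, environment)

girl = SkolemTerm("girl")
specify(girl)                      # specification at the NP level: environment {}
sentence = ("all", "y", ("imp", ("boy", "y"), ("likes", "y", girl)))
specify(sentence)                  # specification at the S level: environment {y}
specify(sentence)                  # repeating the step adds nothing new
```

After these calls the single term for “some girl” carries both environments, which correspond to the two readings of “Every boy likes some girl”.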
Syntax-to-semantics mapping
- The syntactic rules are mapped to semantic transformations based on the following table:

  Rule             λ-abstraction
  fapp(ΛB, ΛC)     ΛA = app(ΛB, ΛC)
  bapp(ΛB, ΛC)     ΛA = app(ΛC, ΛB)
  fcomp(ΛB, ΛC)    ΛA = λx̄.app(ΛB, app(ΛC, x̄))
  bcomp(ΛB, ΛC)    ΛA = λx̄.app(ΛC, app(ΛB, x̄))

- x̄ is a vector containing the outer λ-terms of the predicate that remain to be filled after the composition
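Reading the logical forms as Python functions gives a compact sketch of the four rules. It assumes the standard CCG semantics of composition (the argument is fed to the inner function first) and handles only the case where x̄ has a single element; the lexical meanings are illustrative.

```python
def fapp(b, c):
    return b(c)                    # X/Y  Y    =>  X

def bapp(b, c):
    return c(b)                    # Y    X\Y  =>  X

def fcomp(b, c):
    return lambda x: b(c(x))       # X/Y  Y/Z  =>  X/Z

def bcomp(b, c):
    return lambda x: c(b(x))       # Y\Z  X\Y  =>  X\Z

# Hypothetical lexical meanings for "Mary married John".
married = lambda x: lambda y: f"married({y},{x})"
mary_raised = lambda p: p("mary")  # type-raised subject, S/(S\NP)

# Application only: married John, then Mary.
by_application = bapp("mary", fapp(married, "john"))
# Type raising and composition: Mary married, then John.
by_composition = fapp(fcomp(mary_raised, married), "john")
```

Both routes yield the same logical form, which is the expected behaviour for alternative CCG derivations of one reading.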
The semantic lexicon
- A simple format that allows various degrees of grouping between categories and words
- Each entry comprises a descriptive title, a list of CCG categories, a list of surface forms, and a logical expression in prefix notation
- Example: the entry for universal quantifiers
  [universal]
  categories: (S/(S\NP))/N|NP/N
  words: every|each|all
  LF: lam(p,lam(q,all(x,impl(app(p,x),app(q,x)))))
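An entry of this shape is easy to read mechanically. The sketch below assumes the layout is exactly a bracketed title followed by one "key: value" line per field, with "|" separating alternatives; the real lexicon file format may differ in details, and parse_entry is a hypothetical helper.

```python
ENTRY = r"""
[universal]
categories: (S/(S\NP))/N|NP/N
words: every|each|all
LF: lam(p,lam(q,all(x,impl(app(p,x),app(q,x)))))
"""

def parse_entry(text):
    """Parse one lexicon entry into a dictionary."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    title = lines[0].strip("[]")
    # Split on the first colon only, so the logical form may contain colons.
    fields = dict(ln.split(":", 1) for ln in lines[1:])
    return {
        "title": title,
        "categories": fields["categories"].strip().split("|"),
        "words": fields["words"].strip().split("|"),
        "lf": fields["LF"].strip(),
    }

entry = parse_entry(ENTRY)
```

Grouping several words and categories under one logical form means that every listed word receives that form for every listed category.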
Syntactic parsing results
- Probabilistic model trained on Sections 02-21 of CCGbank
- Evaluation performed on Section 23 of CCGbank

  Parser                   Cov.   LexCat   〈P,H,S〉   〈〉
  Clark et al. (2002)      95.0   90.3     81.8      90.0
  Hockenmaier (2003)       99.8   92.2     85.1      91.4
  Clark & Curran (2004)    99.6   93.6     86.4      92.3
                           96.6   92.4     71.8      78.8
Evaluation of semantic component
- The semantic component of the parser was tested on 50 sentences presenting a wide range of linguistic challenges
- More specifically, the following cases were tested:
  • Scope inversion
  • “Donkey” sentences
  • Scope asymmetries
  • Intermediate scope
  • Spurious readings
  • Coordination cases
  • Generalized quantifiers
- In almost every case the results conformed to the predictions of the theory
Form of the derivations
- Sample derivation: “Some logician proved every theorem”
  • ∀x [theorem(x) → proved(sk^{}{x}_logician, x)]
    (packed Skolem term carrying both environments, {} and {x})
(lex) Some :- NP/N : lam:p.lam:q.q(skolem(p))
(lex) logician :- N : lam:x.logician(x)
(>) Some logician :- NP : lam:q.q(sk{lam:x.logician(x)}_{})
(lex) proved :- (S\NP)/NP : lam:x.lam:y.proved(y,x)
(lex) every :- NP/N : lam:p.lam:q.all:x[p(x)->q(x)]
(lex) theorem :- N : lam:x.theorem(x)
(>) every theorem :- NP : lam:q.all:x[theorem(x)->q(x)]
(>) proved every theorem :- S\NP : lam:y.all:x[theorem(x)->proved(y,x)]
(gram) type-changing3: S\NP => NP\NP
(tchange3) proved every theorem :- NP\NP : lam:y.all:x[theorem(x)->proved(y,x)]
(<) Some logician proved every theorem :- NP :
all:x[theorem(x)->proved(sk{lam:x.logician(x)}_{}_{x},x)]
However...
- The probabilistic model is too weak to properly guide the semantic derivation in many cases
- Wide-coverage parsers stretch the flexibility of the grammar in order to provide some sort of analysis, even a wrong one
- Example: “Every man walks and talks”
  (lex) Every man :- NP
  (lex) walks :- (S\NP)/NP
  (lex) and :- conj
  (lex) talks :- ∗N
  (>) and talks :- N
  (N ⇒ NP) and talks :- NP
  (>) walks and talks :- S\NP
  (<) Every man walks and talks :- S
- In such cases, proper semantic derivation is blocked: semantics simply cannot follow syntax
Future work
- Fine-tuning the probabilistic model
- Extending the semantic lexicon for truly wide-coverage semantic parsing
- Adding semantic aspects such as negation and polarity
- Improving coverage of generalized quantifiers
- Presenting the results in a less cryptic form, by properly unpacking and enumerating all the available readings