Wide-Coverage CCG Parsing with Quantifier Scope

Dimitrios Kartsaklis. MSc Thesis, University of Edinburgh. Supervisor: Professor Mark Steedman. July 18, 2011.

Upload: dimkart

Post on 18-Dec-2014

Page 1: Wide-Coverage CCG Parsing with Quantifier Scope

Wide-Coverage CCG Parsing with Quantifier Scope

Dimitrios Kartsaklis

MSc Thesis, University of Edinburgh

Supervisor: Professor Mark Steedman

July 18, 2011

Page 2: Wide-Coverage CCG Parsing with Quantifier Scope

Introduction

▶ A Natural Language Processing project

▶ Dealing with semantics, and specifically with quantifier scope ambiguities

▶ Purpose: the creation of a wide-coverage semantic parser capable of handling quantifier scope ambiguities using Generalized Skolem Terms

▶ Grammar formalism: Combinatory Categorial Grammar (CCG)

▶ Logical form: first-order logic, using λ-calculus as the “glue” language

Page 3: Wide-Coverage CCG Parsing with Quantifier Scope

Quantification

▶ All known human languages make use of quantification. In English:
  • Universal quantifiers (∀): every, each, all, ...
  • Existential quantifiers (∃): a, some, ...
  • Generalized quantifiers: most, at least, few, ...

▶ Traditional representations using first-order logic and λ-calculus:
  • Universal: λp.λq.∀x[p(x) → q(x)]
  • Existential: λp.λq.∃x[p(x) ∧ q(x)]

Page 4: Wide-Coverage CCG Parsing with Quantifier Scope

Compositionality

▶ Frege’s principle: “The meaning of the whole is a function of the meaning of its parts”

▶ Example: “Every boy likes some girl”

  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.∃x[p(x) ∧ q(x)]
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.∃x[girl(x) ∧ q(x)]
  (>) likes some girl :- S\NP : λy.∃x[girl(x) ∧ likes(y, x)]
  (<) Every boy likes some girl :- S : ∀y[boy(y) → ∃x[girl(x) ∧ likes(y, x)]]

Page 5: Wide-Coverage CCG Parsing with Quantifier Scope

Quantifier scope ambiguities

▶ Example: “Every boy likes some girl”
  • ∀x[boy(x) → ∃y[girl(y) ∧ likes(x, y)]]
    (every boy likes a possibly different girl)

▶ However, this is not the only meaning:
  • ∃y[girl(y) ∧ ∀x[boy(x) → likes(x, y)]]
    (there is a specific girl who is liked by every boy)

▶ But our semantics is surface-compositional, so only the first reading is allowed by syntax

▶ We need a quantification method that delivers both readings in a single syntactic derivation
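The difference between the two readings can be checked against a small model. The following is a minimal Python sketch; the domain and the `likes` relation are made-up illustrative data, not taken from the thesis.

```python
# Hypothetical toy model: two boys, two girls, each boy likes a different girl.
boys = {"al", "bo"}
girls = {"cy", "di"}
likes = {("al", "cy"), ("bo", "di")}

def surface_reading():
    # ∀x[boy(x) → ∃y[girl(y) ∧ likes(x, y)]]
    return all(any((x, y) in likes for y in girls) for x in boys)

def inverse_reading():
    # ∃y[girl(y) ∧ ∀x[boy(x) → likes(x, y)]]
    return any(all((x, y) in likes for x in boys) for y in girls)

print(surface_reading(), inverse_reading())  # True False
```

In this model every boy likes some girl, but no single girl is liked by all boys, so the two logical forms are not equivalent.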

Page 6: Wide-Coverage CCG Parsing with Quantifier Scope

Underspecification

▶ A solution to the problem: provide underspecified representations of the quantified expressions without explicitly specifying their scope:
  • 〈loves(x1, x2), (λq.∀x[boy(x) → q(x)], 1), (λq.∃y[girl(y) ∧ q(y)], 2)〉

▶ Specification is performed in a separate step, after the end of the syntactic derivation, by combining the available quantified expressions in every possible way

▶ The best-known underspecification technique is Cooper storage
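The separate specification step can be pictured as follows. This is a simplified Python sketch in the spirit of storage-based approaches, not the actual Cooper storage algorithm; the tuple layout of the store and the string-based formulas are assumptions made for illustration.

```python
from itertools import permutations

# Stored quantifiers as (symbol, variable, restrictor, connective).
# This tuple layout is an illustrative assumption.
STORE = [("∀", "x", "boy", "→"), ("∃", "y", "girl", "∧")]
CORE = "loves(x, y)"

def readings(core, store):
    """Wrap the core in the stored quantifiers, in every possible order."""
    results = []
    for order in permutations(store):
        formula = core
        # Quantifiers applied later take wider scope.
        for symbol, var, noun, conn in order:
            formula = f"{symbol}{var}[{noun}({var}) {conn} {formula}]"
        results.append(formula)
    return results

for r in readings(CORE, STORE):
    print(r)
```

With n stored quantifiers this produces n! candidate readings, and nothing in the procedure consults the syntactic derivation.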

Page 7: Wide-Coverage CCG Parsing with Quantifier Scope

Underspecification problems

▶ Decoupling the semantic derivation from the syntactic combinatorics can lead to problems:
  • Possibly equivalent logical forms: “Some boy likes some girl”
      a. ∃x[boy(x) ∧ ∃y[girl(y) ∧ likes(x, y)]]
      b. ∃y[girl(y) ∧ ∃x[boy(x) ∧ likes(x, y)]]
  • Scope asymmetries: “Every boy likes, and every girl detests, some saxophonist”:
      ∀x[boy(x) → ∃y[sax(y) ∧ likes(x, y)]] ∧ ∃v[sax(v) ∧ ∀z[girl(z) → detests(z, v)]]
  • Intermediate readings: “Some teacher showed every pupil every movie”:
      ∀x[movie(x) → ∃y[teacher(y) ∧ ∀z[pupil(z) → showed(x, y, z)]]]

Page 8: Wide-Coverage CCG Parsing with Quantifier Scope

Skolemization

▶ If existentials cause such problems, why not remove them altogether?

▶ Skolemization: the process of replacing an existential quantifier with a function of all universally quantified variables in whose scope the existential falls.

▶ Example 1: ∀x∃y∀z.P(x, y, z) ⟹ ∀x∀z.P(x, sk(x), z)
  • The existential y is replaced by sk(x), since x was the only preceding universal.

▶ Example 2: ∃y∀x∀z.P(x, y, z) ⟹ ∀x∀z.P(x, sk(), z)
  • Now sk() is a function without arguments, i.e. a constant.

▶ Example 3: “Every boy likes some girl”:
  • Normal form: ∀x[boy(x) → ∃y[girl(y) ∧ likes(x, y)]]
  • Skolemized form: ∀x[boy(x) → likes(x, sk^{x}_girl)]
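The replacement rule in Examples 1 and 2 can be sketched for formulas in prenex form. This is a minimal Python illustration; the list-based encoding of the quantifier prefix and the `sk_<var>` naming of Skolem functions are assumptions of the sketch.

```python
def skolemize(prefix, matrix_args):
    """Skolemize a prenex formula.

    prefix: list of (quantifier, variable), quantifier in {"forall", "exists"}.
    matrix_args: argument list of the predicate in the matrix.
    Returns (remaining universal variables, rewritten argument list).
    """
    universals = []   # universals seen so far, in order
    replacement = {}  # existential variable -> Skolem term
    for quant, var in prefix:
        if quant == "forall":
            universals.append(var)
        else:
            # The Skolem function depends on all preceding universals.
            replacement[var] = f"sk_{var}({', '.join(universals)})"
    args = [replacement.get(a, a) for a in matrix_args]
    return universals, args

# Example 1: ∀x∃y∀z.P(x, y, z)
print(skolemize([("forall", "x"), ("exists", "y"), ("forall", "z")], ["x", "y", "z"]))
# Example 2: ∃y∀x∀z.P(x, y, z)
print(skolemize([("exists", "y"), ("forall", "x"), ("forall", "z")], ["x", "y", "z"]))
```

In the first call the existential becomes `sk_y(x)`; in the second it becomes the constant `sk_y()`.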

Page 9: Wide-Coverage CCG Parsing with Quantifier Scope

One step further: Generalized Skolem Terms

▶ The only true quantifiers in English are the universals every, each, and their relatives

▶ Every other non-universal quantifier should be associated with a (yet unspecified) Skolem term

▶ Skolem terms can be specified according to their environment at any step of the derivation process into a generalized form

    sk^{E}_{n:p;c}

  where E is the environment (preceding universals) and
  • n is the number of the originating noun phrase
  • p is a nominal property (e.g. “girl”)
  • c is a cardinality condition (e.g. λs.|s| > 1)

(Mark Steedman, Natural Semantics of Scope, currently in publication by MIT Press)

Page 10: Wide-Coverage CCG Parsing with Quantifier Scope

Two available readings

▶ Specification takes place at the beginning of the derivation

  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.q(skolem(p))
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.q(skolem(λx.girl(x)))
  (specification) some girl :- NP : λq.q(sk^{}_girl)
  (>) likes some girl :- S\NP : λy.likes(y, sk^{}_girl)
  (<) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, sk^{}_girl)]

▶ Specification takes place at the end of the derivation

  (lex) Every :- NP/N : λp.λq.∀y[p(y) → q(y)]
  (lex) boy :- N : λy.boy(y)
  (lex) likes :- (S\NP)/NP : λx.λy.likes(y, x)
  (lex) some :- NP/N : λp.λq.q(skolem(p))
  (lex) girl :- N : λx.girl(x)
  (>) Every boy :- NP : λq.∀y[boy(y) → q(y)]
  (>) some girl :- NP : λq.q(skolem(λx.girl(x)))
  (>) likes some girl :- S\NP : λy.likes(y, skolem(λx.girl(x)))
  (<) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, skolem(λx.girl(x)))]
  (specification) Every boy likes some girl :- S : ∀y[boy(y) → likes(y, sk^{y}_girl)]

Page 11: Wide-Coverage CCG Parsing with Quantifier Scope

Advantages

▶ Provides a global solution to the most important quantification problems

▶ Skolem terms are part of the semantic theory, not ad-hoc mechanisms

▶ Easy integration with CCG parsers: no significant increase in computational complexity

▶ Semantic derivation is performed “on-line”, based on the combinatory rules of CCG
  • This limits the degree of freedom in which the available readings are derived, so non-attested or redundant readings are excluded

Page 12: Wide-Coverage CCG Parsing with Quantifier Scope

A proof of concept

▶ Purpose of the project: the creation of a wide-coverage semantic parser that applies the previously described quantification approach.

▶ Main tasks:
  1. Create a wide-coverage probabilistic syntactic parser
  2. Create a λ-calculus framework for the logical forms
  3. Integrate the semantic combinatorics into the parser
  4. Provide appropriate logical forms for the CCG lexicon

▶ Eventually: provide a proof of concept for the theory by testing the results in specific quantification cases

Page 13: Wide-Coverage CCG Parsing with Quantifier Scope

Syntactic parsing

▶ The parser is based on the OpenCCG framework
  • Well-tested API for parsing, supports every aspect of CCG

▶ Two additions:
  • A supertagger for assigning initial categories to the words (Clark & Curran)
  • A probabilistic model incorporating head-word dependencies (Hockenmaier)

▶ Standard interpolation techniques for dealing with sparse data problems

▶ Beam search for pruning the search space

Page 14: Wide-Coverage CCG Parsing with Quantifier Scope

Probabilistic model

▶ A generative model at the level of local trees (trees of depth 1) (Hockenmaier)

▶ Baseline version: the probability of a local tree with root N and children H and S is the product of:
  • An expansion probability P(expansion | N)
  • A head probability P(H | N, expansion)
  • A non-head probability P(S | N, expansion, H)
  • A lexical probability P(w | N, expansion = leaf)

▶ The overall probability of a derivation is the product of the probabilities of all local trees

▶ Head-word dependencies version: also take into account the relationships between the heads of local trees
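The baseline computation can be sketched as a recursion over local trees. This is a minimal Python illustration, not the thesis implementation: the probability tables hold made-up values, the tuple encoding of trees is an assumption, and unary expansions, smoothing and head-word dependencies are left out.

```python
import math

# Made-up probability tables (illustrative values only).
P_EXP = {("S", "right"): 0.6, ("S\\NP", "leaf"): 0.3, ("NP", "leaf"): 0.5}
P_HEAD = {("S\\NP", "S", "right"): 0.9}           # P(H | N, expansion)
P_NONHEAD = {("NP", "S", "right", "S\\NP"): 0.8}  # P(S | N, expansion, H)
P_LEX = {("walks", "S\\NP"): 0.01, ("John", "NP"): 0.02}

def logprob(node):
    """Log-probability of a derivation.

    A leaf is (category, word); a binary local tree is
    (category, expansion, head_child, non_head_child).
    """
    if len(node) == 2:
        cat, word = node
        return math.log(P_EXP[(cat, "leaf")]) + math.log(P_LEX[(word, cat)])
    cat, exp, head, sister = node
    lp = math.log(P_EXP[(cat, exp)])
    lp += math.log(P_HEAD[(head[0], cat, exp)])
    lp += math.log(P_NONHEAD[(sister[0], cat, exp, head[0])])
    # The derivation probability is the product over all local trees.
    return lp + logprob(head) + logprob(sister)

# "John walks": S expands with its head (S\NP) on the right.
derivation = ("S", "right", ("S\\NP", "walks"), ("NP", "John"))
print(math.exp(logprob(derivation)))
```

Working in log space avoids underflow when many small probabilities are multiplied.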

Page 15: Wide-Coverage CCG Parsing with Quantifier Scope

Semantic forms

▶ An object-oriented approach

▶ Formulas are represented as a set of nested objects
  • Example: “Every man walks”
    Infix notation: ∀x[man(x) → walks(x)]
    Prefix notation: all(x,imp(man(x),walks(x)))

[Figure: object tree for the example. Root: all(x, expr); its expression child: imp(expr1, expr2); leaves: man(x) and walks(x), each holding a reference to the variable x]
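The nested-object idea can be sketched with a few small classes. This is an illustrative Python sketch only; the class names `Pred`, `Imp` and `All` are hypothetical and do not correspond to the classes of the actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pred:
    """A predicate applied to arguments, e.g. man(x)."""
    name: str
    args: tuple

    def __str__(self):
        return f"{self.name}({','.join(map(str, self.args))})"

@dataclass(frozen=True)
class Imp:
    """An implication between two sub-expressions."""
    left: object
    right: object

    def __str__(self):
        return f"imp({self.left},{self.right})"

@dataclass(frozen=True)
class All:
    """A universal quantification over a variable."""
    var: str
    body: object

    def __str__(self):
        return f"all({self.var},{self.body})"

formula = All("x", Imp(Pred("man", ("x",)), Pred("walks", ("x",))))
print(formula)  # all(x,imp(man(x),walks(x)))
```

Each object prints itself in prefix notation by delegating to its children, which mirrors the nesting of the tree.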

Page 16: Wide-Coverage CCG Parsing with Quantifier Scope

Object hierarchy

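[Figure: class hierarchy of the logical-form objects, rooted at Expression]

▶ Second level of the diagram: LambdaAbstraction, FunctionalApplication, FirstOrderExpression, Variable, SkolemTerm

▶ Lower levels of the diagram: Quantification, Conjunction, ...; BindingTerm, PlainVariable; Lambda, Quantified; GeneralizedST; GenericPredicate; Constant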

Page 17: Wide-Coverage CCG Parsing with Quantifier Scope

Skolem Term representations

▶ A single linked packed structure: a SkolemTerm holds object references (pointers) to its specifications

[Figure: a SkolemTerm node pointing to GeneralizedSkolemTerm(Specification1), GeneralizedSkolemTerm(Specification2) and GeneralizedSkolemTerm(Specification3)]

▶ Example: “A boy ate a pizza”

  ate({skolem′, sk′}boy′, {skolem′, sk′}pizza′)

Page 18: Wide-Coverage CCG Parsing with Quantifier Scope

β-conversion

▶ β-conversion: the process of substituting a bound variable in the body of a λ-abstraction by the argument passed to the function

▶ A stack-based method (Blackburn & Bos)

  1. When the expression is an application, push its argument to the stack and discard the outermost application object.

  2. If the expression is a λ-abstraction, throw away the λ-term, pop the item at the top of the stack, and substitute it for every occurrence of the correlated variable.

  3. If the expression is neither an application nor a λ-abstraction, β-convert its sub-expressions.

Page 19: Wide-Coverage CCG Parsing with Quantifier Scope

β-conversion

▶ Example: “John loves Mary”

  (lex) John :- NP : john
  (lex) loves :- (S\NP)/NP : λx.λy.loves(y, x)
  (lex) Mary :- NP : mary
  (>) loves Mary :- S\NP : λy.loves(y, mary)
  (<) John loves Mary :- S : loves(john, mary)

  Step  Expression                                      Stack
  1     app(app(lam(x,lam(y,loves(y,x))),mary),john)    []
  2     app(lam(x,lam(y,loves(y,x))),mary)              [john]
  3     lam(x,lam(y,loves(y,x)))                        [mary,john]
  4     lam(y,loves(y,mary))                            [john]
  5     loves(john,mary)                                []
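The three steps of the stack-based method, applied to the trace above, can be sketched as follows. This is a minimal Python illustration using nested tuples rather than the object classes of the actual system; it assumes variable names are already distinct, so no renaming (α-conversion) is performed, and it keeps the top of the stack at the end of the list.

```python
def substitute(term, var, value):
    """Replace free occurrences of var in term by value."""
    if term == var:
        return value
    if isinstance(term, tuple):
        if term[0] == "lam" and term[1] == var:
            return term  # var is rebound here, so stop
        return (term[0],) + tuple(substitute(t, var, value) for t in term[1:])
    return term

def beta(term, stack=None):
    """Stack-based β-conversion over ("app", f, a) and ("lam", v, body)."""
    stack = [] if stack is None else stack
    if isinstance(term, tuple) and term[0] == "app":
        # Step 1: push the argument, discard the application node.
        stack.append(term[2])
        return beta(term[1], stack)
    if isinstance(term, tuple) and term[0] == "lam":
        if stack:
            # Step 2: drop the λ, pop the argument, substitute it.
            return beta(substitute(term[2], term[1], stack.pop()), stack)
        return ("lam", term[1], beta(term[2]))
    if isinstance(term, tuple):
        # Step 3: β-convert the sub-expressions.
        term = (term[0],) + tuple(beta(t) for t in term[1:])
    while stack:
        # The head is not a λ: rebuild the leftover applications.
        term = ("app", term, beta(stack.pop()))
    return term

loves = ("lam", "x", ("lam", "y", ("loves", "y", "x")))
expr = ("app", ("app", loves, "mary"), "john")
print(beta(expr))  # ('loves', 'john', 'mary')
```

The final `while` loop is an addition of this sketch: it handles applications whose head never reduces to a λ-abstraction, a case the three steps above do not spell out.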

Page 20: Wide-Coverage CCG Parsing with Quantifier Scope

OpenCCG and CKY

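▶ OpenCCG uses the CKY algorithm:

[Figure: CKY chart for “Mary married John”. Cell contents: Mary: NP, S/(S\NP); married: (S\NP)/NP; John: NP, S/(S\NP); Mary married: S/NP; married John: S\NP; Mary married John: S, S]

  Derivation 1 (application only):
  (lex) Mary :- NP
  (lex) married :- (S\NP)/NP
  (lex) John :- NP
  (>) married John :- S\NP
  (<) Mary married John :- S

  Derivation 2 (type-raising and composition):
  (lex) Mary :- NP
  (lex) married :- (S\NP)/NP
  (lex) John :- NP
  (>T) Mary :- S/(S\NP)
  (>B) Mary married :- S/NP
  (>) Mary married John :- S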

Page 21: Wide-Coverage CCG Parsing with Quantifier Scope

CKY modifications

▶ An additional step, called Skolem term specification, is introduced in the inner loop of the CKY algorithm:

  1. For each Skolem term ST in the logical form ΛA, collect the new environment (preceding universals) of ST.

  2. If the new environment is different from the old environment, specify a new Generalized Skolem Term and add it to the specifications list of ST.

  where ΛA is the logical form of a result category A that has been produced by the application of some CCG rule

▶ The environment is always readily available thanks to the nested internal structure
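The specification step can be sketched as a walk over the nested logical form that tracks the universals in scope. This is an illustrative Python sketch; the `SkolemTerm` class, its fields and the tuple encoding of formulas are assumptions, not the data structures of the actual parser.

```python
class SkolemTerm:
    """A packed Skolem term: one object, many specifications."""
    def __init__(self, prop):
        self.prop = prop
        self.env = None   # last environment seen (None: not yet specified)
        self.specs = []   # list of environments, one per specification

def specify(term, universals=()):
    """Run after each rule application on the result's logical form."""
    if isinstance(term, SkolemTerm):
        env = frozenset(universals)
        if env != term.env:
            # New environment: add a new generalized Skolem term.
            term.env = env
            term.specs.append(env)
    elif isinstance(term, tuple):
        if term[0] == "all":
            # ("all", var, body): var joins the environment of the body.
            specify(term[2], universals + (term[1],))
        else:
            for sub in term[1:]:
                specify(sub, universals)

# "Some logician proved every theorem"
sk = SkolemTerm("logician")
specify(sk)  # after building the NP "Some logician": environment {}
lf = ("all", "x", ("imp", ("theorem", "x"), ("proved", sk, "x")))
specify(lf)  # after the final rule: environment {x}
print(sk.specs)
```

After both calls the term carries two specifications, with environments {} and {x}, which corresponds to the two scope readings of the sentence.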

Page 22: Wide-Coverage CCG Parsing with Quantifier Scope

Syntax-to-semantics mapping

▶ The syntactic rules are mapped to semantic transformations based on the following table:

  Rule              Logical form of the result
  fapp(ΛB, ΛC)      ΛA = app(ΛB, ΛC)
  bapp(ΛB, ΛC)      ΛA = app(ΛC, ΛB)
  fcomp(ΛB, ΛC)     ΛA = λx̄.app(ΛB, app(ΛC, x̄))
  bcomp(ΛB, ΛC)     ΛA = λx̄.app(ΛC, app(ΛB, x̄))

▶ x̄ is a vector containing the outer λ-bound variables of the predicate that remain to be filled after the composition
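The mapping can be sketched as four small constructor functions. This is a minimal Python illustration with tuple-encoded terms; it follows the standard semantics of composition, λx.f(g(x)), and simplifies the vector x̄ to a single variable whose default name `z` is an arbitrary choice of the sketch.

```python
def fapp(lb, lc):
    """Forward application: the left logical form is the function."""
    return ("app", lb, lc)

def bapp(lb, lc):
    """Backward application: the right logical form is the function."""
    return ("app", lc, lb)

def fcomp(lb, lc, x="z"):
    """Forward composition: λx. ΛB(ΛC(x))."""
    return ("lam", x, ("app", lb, ("app", lc, x)))

def bcomp(lb, lc, x="z"):
    """Backward composition: λx. ΛC(ΛB(x))."""
    return ("lam", x, ("app", lc, ("app", lb, x)))

# "Mary married": type-raised subject composed with the verb.
mary_raised = ("lam", "p", ("app", "p", "mary"))
married = ("lam", "x", ("lam", "y", ("married", "y", "x")))
print(fcomp(mary_raised, married))
```

The result is an unreduced term; β-conversion is a separate step applied afterwards.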

Page 23: Wide-Coverage CCG Parsing with Quantifier Scope

The semantic lexicon

▶ A simple form that allows various degrees of grouping between categories and words

▶ Each entry comprises a descriptive title, a list of CCG categories, a list of surface forms, and a logical expression in prefix notation

▶ Example: the entry for universal quantifiers

  [universal]
  categories: (S/(S\NP))/N|NP/N
  words: every|each|all
  LF: lam(p,lam(q,all(x,impl(app(p,x),app(q,x)))))
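An entry of this shape is easy to read programmatically. The following Python sketch assumes the file syntax is exactly what the example shows (a bracketed title followed by `key: value` lines, with `|` separating alternatives); the real lexicon format may differ in details.

```python
ENTRY = r"""
[universal]
categories: (S/(S\NP))/N|NP/N
words: every|each|all
LF: lam(p,lam(q,all(x,impl(app(p,x),app(q,x)))))
"""

def parse_lexicon(text):
    """Return {title: {"categories": [...], "words": [...], "LF": str}}."""
    entries, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A bracketed title opens a new entry.
            current = entries.setdefault(line[1:-1], {})
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            key, value = key.strip(), value.strip()
            # The logical form stays a string; the lists are split on "|".
            current[key] = value if key == "LF" else value.split("|")
    return entries

lexicon = parse_lexicon(ENTRY)
print(lexicon["universal"]["words"])  # ['every', 'each', 'all']
```

Grouping several words and categories under one logical form keeps the lexicon small: all three universal quantifiers share a single entry.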

Page 24: Wide-Coverage CCG Parsing with Quantifier Scope

Syntactic parsing results

▶ Probabilistic model trained on Sections 02-21 of CCGbank

▶ Evaluation has been performed on Section 23 of CCGbank

  Parser                   Cov.   LexCat   〈P,H,S〉   〈〉
  Clark et al. (2002)      95.0   90.3     81.8      90.0
  Hockenmaier (2003)       99.8   92.2     85.1      91.4
  Clark & Curran (2004)    99.6   93.6     86.4      92.3
  This work                96.6   92.4     71.8      78.8

Page 25: Wide-Coverage CCG Parsing with Quantifier Scope

Evaluation of semantic component

▶ The semantic aspect of the parser was tested on 50 sentences presenting a wide range of linguistic challenges

▶ More specifically, the following cases were tested:
  • Scope inversion
  • “Donkey” sentences
  • Scope asymmetries
  • Intermediate scope
  • Spurious readings
  • Coordination cases
  • Generalized quantifiers

▶ In almost every case the results conformed to the predictions of the theory

Page 26: Wide-Coverage CCG Parsing with Quantifier Scope

Form of the derivations

▶ Sample derivation: “Some logician proved every theorem”
  • ∀x[theorem(x) → proved(sk^{}{x}_logician, x)]
    (a packed Skolem term carrying both specifications: environments {} and {x})

(lex) Some :- NP/N : lam:p.lam:q.q(skolem(p))

(lex) logician :- N : lam:x.logician(x)

(>) Some logician :- NP : lam:q.q(sk{lam:x.logician(x)}_{})

(lex) proved :- (S\NP)/NP : lam:x.lam:y.proved(y,x)

(lex) every :- NP/N : lam:p.lam:q.all:x[p(x)->q(x)]

(lex) theorem :- N : lam:x.theorem(x)

(>) every theorem :- NP : lam:q.all:x[theorem(x)->q(x)]

(>) proved every theorem :- S\NP : lam:y.all:x[theorem(x)->proved(y,x)]

(gram) type-changing3: S\NP => NP\NP

(tchange3) proved every theorem :- NP\NP : lam:y.all:x[theorem(x)->proved(y,x)]

(<) Some logician proved every theorem :- NP :

all:x[theorem(x)->proved(sk{lam:x.logician(x)}_{}_{x},x)]

Page 27: Wide-Coverage CCG Parsing with Quantifier Scope

However...

▶ The probabilistic model is too weak to properly guide the semantic derivation in many cases

▶ Wide-coverage parsers stretch the flexibility of the grammar in order to provide some sort of analysis, even a wrong one

▶ Example: “Every man walks and talks”

  Categories assigned (in sentence order): NP, (S\NP)/NP, conj, ∗N
  Rule sequence: (>) gives N; type-changing N ⇒ NP; (>) gives S\NP; (<) gives S

▶ In such cases, proper semantic derivation is blocked: semantics simply cannot follow syntax

Page 28: Wide-Coverage CCG Parsing with Quantifier Scope

Future work

▶ Fine-tuning the probabilistic model

▶ Extending the semantic lexicon for truly wide-coverage semantic parsing

▶ Adding semantic aspects such as negation and polarities

▶ Improving the coverage of generalized quantifiers

▶ Presenting the results in a less cryptic form, by properly unpacking and enumerating all the available readings
