From Verbal Argument Structures to Nominal Ones:

A Data-Mining Approach

Olya Gurevich

1 December 2010

© 2010 Microsoft

Talk Outline

Powerset: a natural language search engine (acquired by Microsoft in 2008)

Deverbal nouns and their arguments

Data collection and corpus-based modeling

Baseline system

Experiments

Conclusions

Powerset: Natural Language Search

Queries and documents undergo syntactic and semantic parsing

Semantic representations allow both more constrained and more expansive matching compared to keywords
► Who invaded Rome ≠ Who did Rome invade
► Who did Rome invade ≈ Who was invaded by Rome
► Who invaded Rome ≈ Who attacked Rome
► Who invaded Rome ≈ Who was the invader of Rome

Worked on English-language Wikipedia

NL technology initially developed at Xerox PARC (XLE)

Deverbal Nouns

Events often realized as nouns, not verbs
► Armstrong’s return after his retirement ==> Armstrong returned after he retired
► The destruction of Rome by the Huns was devastating ==> The Huns destroyed Rome
► The Yankees’ defeat over the Mets ==> The Yankees defeated the Mets
► Kasparov’s defense of his knight ==> Kasparov defended his knight

In search, need to map deverbal expression to the verb (or vice versa)

Deverbal Types

Eventive
► destruction, return, death

Agent-like
► Henri IV was the ruler of France ==> Henri IV ruled France

Patient-like
► Mary is an IBM employee ==> IBM employs Mary

Deverbal Role Ambiguity

Deverbal syntax doesn’t always determine argument role
► They jumped to the support of the Queen ==> They supported the Queen
► They enjoyed the support of the Queen ==> The Queen supported them
► We talked about the Merrill Lynch acquisition ==> Was Merrill Lynch acquired? Or did it acquire something?

Particularly problematic if underlying verb is transitive but the deverbal noun has only one argument

Baseline system

LFG-based syntactic parser (XLE)
► Grammar is rule-based
► Disambiguation component is statistically trained

List of ~4000 deverbals and corresponding verbs, from
► WordNet derivational morphology
► NomLex, NomLex Plus
► Hand curation

Verb lexicon with subcategorization frames

Baseline system cont.

Parse sentence using XLE

If a noun is in the list of ~4000 deverbals, map its arguments into those of a corresponding verb using rule-based heuristics. For transitive verbs:
► X’s DV of Y ==> subj(V, X); obj(V, Y), etc.
■ Obama’s support of reform ==> subj(support, Obama); obj(support, reform)
► X’s DV ==> subj(V, X) [default to most-frequent pattern]
■ Obama’s support ==> subj(support, Obama)
► DV of X ==> obj(V, X) [default to most-frequent pattern]
■ support of reform ==> obj(support, reform)
► X DV ==> no role
► Subject-sharing support verbs: make, take

Goal: to improve over default assignments
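
The default assignments for single-argument deverbals of transitive verbs can be sketched as a lookup table. This is a minimal illustration, assuming the parser has already identified the syntactic pattern; the pattern labels (`poss`, `of`, `prenom`) and the function name `baseline_role` are hypothetical and not taken from the original system.

```python
from typing import Optional

# Default role per syntactic pattern for deverbals of transitive verbs,
# following the baseline rules. The pattern labels are hypothetical.
BASELINE_DEFAULTS = {
    "poss": "subj",    # X's DV  ==> subj(V, X)
    "of": "obj",       # DV of X ==> obj(V, X)
    "prenom": None,    # X DV    ==> no role
}


def baseline_role(pattern: str, verb: str, arg: str) -> Optional[str]:
    """Return the default role assignment as a string, or None for no role."""
    role = BASELINE_DEFAULTS[pattern]
    if role is None:
        return None
    return f"{role}({verb}, {arg})"


print(baseline_role("poss", "support", "Obama"))     # subj(support, Obama)
print(baseline_role("of", "support", "reform"))      # obj(support, reform)
print(baseline_role("prenom", "protect", "castle"))  # None
```

Because the table ignores both the deverbal and the argument, it cannot separate the program’s renewal from the king’s promise; the corpus-based models aim to close that gap.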

Baseline system cont.

Agent-like Deverbals
► X’s DVer ==> obj(V, X)
■ the project’s director ==> subj(direct, director); obj(direct, project)
► DVer of X ==> obj(V, X)
■ composer of the song ==> subj(compose, composer); obj(compose, song)

Patient-like Deverbals
► X’s DVee ==> subj(V, X)
■ IBM’s employee ==> subj(employ, IBM); obj(employ, employee)
► DVee of X ==> subj(V, X)
■ captive of the rebels ==> subj(capture, rebels); obj(capture, captive)

Deverbal Task

Goal: predict relation between transitive V and argument X, given {X’s DV}, {DV of X}, or {X DV}
► the program’s renewal ==> obj(renew, program)
► the king’s promise ==> subj(promise, king)
► the destruction of Rome ==> obj(destroy, Rome)
► the rule of Henri IV ==> subj(rule, Henri IV)
► the Congress decision ==> subj(decide, Congress)
► castle protection ==> obj(protect, castle)
► domain adaptation ==> ?(adapt, domain)

Inference from verb usage

Large corpus data can indicate lexical preferences
► Armstrong’s return == Armstrong returned
► return of the book == the book was returned

If: X is more often a subject of V than object
► then: X’s DV or DV of X ==> subj(V, X)

Need to count subj(V,X) | obj(V,X) occurrences for all possible pairs (V,X)

Need lots of parsed sentences!
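
The counting step can be sketched as follows. This is an illustrative sketch only: it assumes the parsed corpus has already been reduced to (verb, role, argument) triples, and the function names and toy data are invented for the example.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Triple = Tuple[str, str, str]  # (verb, role, argument)


def count_roles(triples: Iterable[Triple]) -> Counter:
    """Count how often each (verb, role, argument) triple occurs."""
    return Counter(triples)


def more_frequent_role(counts: Counter, verb: str, arg: str) -> Optional[str]:
    """Return 'subj' or 'obj', whichever is more frequent; None on a tie."""
    n_subj = counts[(verb, "subj", arg)]
    n_obj = counts[(verb, "obj", arg)]
    if n_subj == n_obj:
        return None
    return "subj" if n_subj > n_obj else "obj"


# Invented toy data, standing in for triples extracted from parsed text.
toy = [
    ("return", "subj", "Armstrong"),
    ("return", "subj", "Armstrong"),
    ("return", "obj", "book"),
]
counts = count_roles(toy)
print(more_frequent_role(counts, "return", "Armstrong"))  # subj
print(more_frequent_role(counts, "return", "book"))       # obj
```

A pair that never occurs in the corpus yields a tie at zero and therefore no prediction, which is the practical reason so much parsed text is needed.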

Data Sparseness

Where to get enough parsed data to count all occurrences to model any pair (V,X)?

We have parsed all of the English Wikipedia (2M docs, 121M sentences)

■ cf. Penn TreeBank (~50,000 sentences)

Oceanography: distributed architecture for fast extraction / analysis of huge parsed data sets

72M Role (Verb, Role, Arg) examples
■ 69% of these appear just once, 13% just twice!!

Not enough data to make a good prediction for each individual argument

■ need to generalize across arguments

Deverbal-only model

for each deverbal DV and related verb V

find corpus occurrences of overlapping arguments
► X SUBJ V, X OBJ V, and X’s DV for all X

if (XSUBJ / XOBJ) > 1.5, consider X “subject preferring” for this DV

if DV has more subject-preferring than object-preferring arguments, then map:
► X’s DV ==> subj(V,X) for all X

(conversely for object preference)

if the majority of overlapping arguments for a given V are neither subjects nor objects, DV is “other-preferring”

For each DV, average over all arguments X
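
The voting procedure can be sketched in a few lines. This is a sketch under stated assumptions, not the original implementation: the object-preferring test is assumed to mirror the subject one with the same 1.5 threshold, the “other-preferring” case is left out because it needs counts for non-subject, non-object roles, and a tie is assumed to mean “no preference”. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

RATIO = 1.5  # threshold quoted in the model description


def argument_preference(n_subj: int, n_obj: int, ratio: float = RATIO) -> str:
    """Classify one argument X of verb V as 'subj', 'obj', or 'none'."""
    # Multiplying instead of dividing avoids a zero denominator.
    if n_subj > ratio * n_obj:
        return "subj"
    if n_obj > ratio * n_subj:
        return "obj"
    return "none"


def deverbal_preference(arg_counts: Dict[str, Tuple[int, int]]) -> str:
    """arg_counts maps each overlapping argument X to (subj count, obj count)."""
    votes = Counter(argument_preference(s, o) for s, o in arg_counts.values())
    if votes["subj"] > votes["obj"]:
        return "subj"
    if votes["obj"] > votes["subj"]:
        return "obj"
    return "none"  # tie: assumed to mean no preference for this deverbal


# Counts for "program" and "he" come from the renewal example in this talk;
# the "license" counts are invented to break the tie.
renew_counts = {"program": (9, 72), "he": (615, 42), "license": (3, 40)}
print(deverbal_preference(renew_counts))  # obj
```

Each argument gets one vote regardless of its frequency, so a very common argument such as a pronoun cannot dominate the decision on its own.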

Walk-through example

renewal : renew
► Argument: program
■ program’s renewal: 2 occurrences
■ obj(renew, program): 72 occurrences
■ subj(renew, program): 9 occurrences
■ {renewal, program} is object-preferring
► Argument: he
■ his renewal: 18 occurrences
■ subj(renew, he): 615 occurrences
■ obj(renew, he): 42 occurrences
■ {renewal, he} is subject-preferring

► Object-preferring arguments: 15
► Subject-preferring arguments: 9
► Overall preference for X’s renewal: obj
► But is there a way to model non-majority preferences?
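
The two per-argument decisions follow from the ratio test, assuming the object-preferring test mirrors the subject-preferring one with the same 1.5 threshold (the talk states the threshold only for the subject direction):

```latex
\[
\frac{\mathrm{obj}(\mathit{renew}, \mathit{program})}{\mathrm{subj}(\mathit{renew}, \mathit{program})}
  = \frac{72}{9} = 8 > 1.5
  \;\Rightarrow\; \text{object-preferring}
\]
\[
\frac{\mathrm{subj}(\mathit{renew}, \mathit{he})}{\mathrm{obj}(\mathit{renew}, \mathit{he})}
  = \frac{615}{42} \approx 14.6 > 1.5
  \;\Rightarrow\; \text{subject-preferring}
\]
```

With 15 object-preferring arguments against 9 subject-preferring ones, the majority vote assigns obj to every X’s renewal, including his renewal.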

Overall preferences

For possessive arguments, e.g. X’s renewal, X’s possession
► Subj-preferring: 1786 deverbals (67%)
► Obj-preferring: 884 (33%)
► Default: subj

For of arguments, e.g. renewal of X, possession of X
► Subj-preferring: 839 (29%)
► Obj-preferring: 2036 (71%)
► Default: obj

For prenominal arguments, e.g. X protection, X discovery
► Subj-preferring: 373 (11%)
► Obj-preferring: 1037 (31%)
► Other-preferring: 1933 (58%)
► Default: other (= no role)

Incidence: subjects

[Figure: two charts. “Subject head (N=1000)”, split into Deverbal vs. Verb; “Subject error (N=220)”, split into Agent, Patient, 2-argument, "of", poss, prenom, Other deverb, and Verb. Values not recoverable.]

Incidence: objects

[Figure: two charts. “Object head (N=1000)”, split into Deverbal vs. Verb; “Object error (N=260)”, split into Agent, Patient, 2-argument, "of", poss, prenom, Other deverb, and Verb. Values not recoverable.]

Evaluation Data

X’s DV:
► 1000 hand-annotated sentences
► Possible judgments:
■ subj
■ obj
■ other
► Evaluate classification between subj and non-subj

                    System: Subj    System: Non-subj
Judged: Subj        Correct         Incorrect
Judged: Non-subj    Incorrect       Correct

Evaluation

DV of X:
► 750 hand-annotated sentences
► Evaluate classification between obj and non-obj

                   System: Obj    System: Non-obj
Judged: Obj        Correct        Incorrect
Judged: Non-obj    Incorrect      Correct

Evaluation

X DV:
► 999 hand-annotated sentences
► Evaluate classification between subj, obj, and other (no role)

                 System: Subj    System: Obj    System: Other
Judged: Subj     Correct         Incorrect      Incorrect
Judged: Obj      Incorrect       Correct        Incorrect
Judged: Other    Incorrect       Incorrect      Correct

Evaluation measures

Error = Incorrect / (Correct + Incorrect)
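
A small sketch of how such error rates could be computed from a judged-vs-system confusion table. It assumes the measure is Incorrect / (Correct + Incorrect), and that the per-class errors reported later (e.g. “Subj error”) are taken over the sentences judged to belong to that class; both readings are assumptions, and the function names and toy counts are invented.

```python
from typing import Dict, Tuple


def error_rate(correct: int, incorrect: int) -> float:
    """Error = Incorrect / (Correct + Incorrect)."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no judged examples")
    return incorrect / total


def error_report(confusion: Dict[Tuple[str, str], int]) -> Dict[str, float]:
    """confusion maps (judged, system) label pairs to sentence counts."""
    report = {}
    for label in sorted({judged for judged, _ in confusion}):
        correct = confusion.get((label, label), 0)
        incorrect = sum(
            n for (j, s), n in confusion.items() if j == label and s != label
        )
        report[label] = error_rate(correct, incorrect)
    all_correct = sum(n for (j, s), n in confusion.items() if j == s)
    all_incorrect = sum(n for (j, s), n in confusion.items() if j != s)
    report["overall"] = error_rate(all_correct, all_incorrect)
    return report


# Invented counts for illustration only; not results from the talk.
toy_confusion = {
    ("subj", "subj"): 60,
    ("subj", "non-subj"): 20,
    ("non-subj", "subj"): 5,
    ("non-subj", "non-subj"): 15,
}
print(error_report(toy_confusion))
```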

Deverbal-only Results: Possessives

Combining all arguments for each deverbal reduces role-labeling error by 39% for possessive arguments

[Figure: bar chart “Possessive arguments” showing Subj error, Non-subj error, and Overall error (y-axis 0 to 0.7) for Baseline vs. Deverbal-only model. Values not recoverable.]

Deverbal-only Results: ‘of’ args

Error rate is reduced by 44%

[Figure: bar chart “'of' arguments” showing Obj error, Non-obj error, and Overall error (y-axis 0 to 1) for Baseline vs. Deverbal-only model. Values not recoverable.]

Deverbal-only Results: prenominal args

Error rate is reduced by 28%

Too much smoothing?

Combining all arguments is fairly drastic

Possible features of arguments that may impact behavior:
► Ontological class: hard to get reliable classifications
► Animacy: subjects are more animate than objects (crosslinguistically true)
■ the program’s renewal vs. his renewal

Possible features of deverbals and verbs that may impact behavior:
► Ontological class
► Active vs. passive use of verbs

Animacy-based model

Split model into separate predictors for animate(X) and inanimate(X)
► Animate: pronouns (I, you, he, she, they)
► Inanimate: common nouns
► Ignored proper names due to poor classification into people vs. places vs. organizations

If model does not have a prediction for the class of argument encountered, fall back to deverbal-only model

Results:
► more accurate subject labeling for animate arguments
► lower recall and less accurate object labeling
► overall error rate is about the same as deverbal-only model
► possibly due to insufficient training data
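
The split-and-fall-back logic of the animacy-based model might look like the sketch below. It is illustrative only: the dictionary-based models, the function names, and the treatment of every non-listed, non-proper word as inanimate are assumptions, and arguments are assumed to be lemmatized (e.g. "he" for "his").

```python
from typing import Dict, Optional, Tuple

# Pronouns treated as animate, as listed in the model description.
ANIMATE_PRONOUNS = {"i", "you", "he", "she", "they"}


def animacy_class(arg: str, is_proper_name: bool = False) -> Optional[str]:
    """Return 'animate', 'inanimate', or None for proper names (ignored)."""
    if is_proper_name:
        return None
    # Assumption: anything that is not a listed pronoun counts as a common noun.
    return "animate" if arg.lower() in ANIMATE_PRONOUNS else "inanimate"


def predict_role(
    dv: str,
    arg: str,
    animacy_model: Dict[Tuple[str, str], str],
    deverbal_model: Dict[str, str],
    is_proper_name: bool = False,
) -> Optional[str]:
    """Use the animacy-specific prediction if present, else the deverbal-only one."""
    cls = animacy_class(arg, is_proper_name)
    role = animacy_model.get((dv, cls)) if cls is not None else None
    return role if role is not None else deverbal_model.get(dv)


# Hypothetical model contents for illustration.
animacy_model = {("renewal", "animate"): "subj"}
deverbal_model = {"renewal": "obj"}

print(predict_role("renewal", "he", animacy_model, deverbal_model))       # subj
print(predict_role("renewal", "program", animacy_model, deverbal_model))  # obj
```

In this toy setup the inanimate class has no entry for renewal, so the program’s renewal falls back to the deverbal-only prediction.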

Lexicalized model

Try to make predictions for individual DV+argument pairs

If the model has insufficient evidence for the pair, default to deverbal-only model

Results:
► For possessive args, performance about the same as deverbal-only
► For ‘of’ args, performance slightly worse than deverbal-only
► For prenominal args, much worse performance
► Model is vulnerable to data sparseness and systematic parsing errors (e.g. weather conditions)
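
The back-off from pair-level to deverbal-level predictions can be sketched as below. The talk does not say what counts as sufficient evidence, so the `min_count` threshold is a hypothetical parameter, as are the function name and the reuse of the 1.5 ratio test at the pair level.

```python
from typing import Dict, Optional, Tuple


def lexicalized_role(
    dv: str,
    verb: str,
    arg: str,
    pair_counts: Dict[Tuple[str, str], Tuple[int, int]],
    deverbal_model: Dict[str, str],
    min_count: int = 10,   # hypothetical evidence threshold
    ratio: float = 1.5,    # assumed to match the deverbal-only ratio test
) -> Optional[str]:
    """Predict from (verb, arg) counts when evidence suffices, else back off."""
    n_subj, n_obj = pair_counts.get((verb, arg), (0, 0))
    if n_subj + n_obj >= min_count:
        if n_subj > ratio * n_obj:
            return "subj"
        if n_obj > ratio * n_subj:
            return "obj"
    # Too little evidence, or no clear preference: use the deverbal-only model.
    return deverbal_model.get(dv)


# (subj count, obj count) per (verb, argument); "he" uses counts from the
# renewal example in this talk, "contract" is invented.
pair_counts = {("renew", "he"): (615, 42), ("renew", "contract"): (1, 2)}
deverbal_model = {"renewal": "obj"}

print(lexicalized_role("renewal", "renew", "he", pair_counts, deverbal_model))        # subj
print(lexicalized_role("renewal", "renew", "contract", pair_counts, deverbal_model))  # obj
```

A pair that is frequent only because of a systematic parsing error would clear the evidence threshold and override the deverbal-level default.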

DV+animacy / lex results: possessives

[Figure: bar chart “Possessive arguments” showing Subj error, Non-subj error, and Overall error (y-axis 0 to 0.7) for Baseline, Deverbal-only, Animacy, and Lexicalized models. Values not recoverable.]

DV+animacy / lex results: ‘of’ arguments

[Figure: bar chart “'of' arguments” showing Obj error, Non-obj error, and Overall error (y-axis 0 to 1) for Baseline, Deverbal-only, Animacy, and Lexicalized models. Values not recoverable.]

DV+lex results: prenominal args

[Figure: bar chart “Prenominal arguments” showing Subj error, Obj error, Other error, and Overall error (y-axis 0 to 1.2) for Baseline, Deverbal-only, and Lexicalized models. Values not recoverable.]

Training data size: 10K vs. 2M docs

[Figure: two bar charts over Possessive and "of" arguments. “Coverage vs. Size of training data” (y-axis 0 to 0.9) compares 10K and 2M docs; “Error rate vs. Size of training data” (y-axis 0 to 0.5) compares Baseline, 10K, and 2M. Values not recoverable.]

Support (“light”) verbs

Tried using the same method to derive support verbs
► e.g. make a decision, take a walk, receive a gift

Look for patterns like
► John decided vs. John made a decision ==> lv(decision, make, sb)
► We agreed vs. We had an agreement ==> lv(agreement, have, sb)

Initial predictions had quite a few spurious patterns

After manual curation
► 96 DV-V pairs got a support verb
► 25 unique support verbs
► 28 support verb / argument patterns

Default model fairly fragile

Tight semantic relationship between light verbs and deverbals makes this method less applicable

Directions for future work

Less ad hoc parameter setting

Further lexicalization of the model
► Predictions for ontological classes of arguments
► Use properties of verbal constructions (e.g. passive vs. active, tense, etc.)

More fine-grained classification of non-subj/obj roles
► director of 12 years
► Bill Gates’ foundation
► the Delhi Declaration

Conclusions

Knowing how arguments typically participate in events allows interpretation of ambiguous deverbal syntax

Large parsed corpora are a valuable resource

Even the simplest models greatly reduce error

More data is better

References

M. Banko and E. Brill, Scaling to very very large corpora for natural language disambiguation, ACL 2001.

S. Riezler, T. H. King, R. Kaplan, J. T. Maxwell III, R. Crouch, and M. Johnson, Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques, ACL 2002.

S. A. Waterman, Distributed parse mining, in SETQA-NLP 2009.

O. Gurevich, R. Crouch, T. H. King, and V. de Paiva, Deverbal nouns in knowledge representation, Journal of Logic and Computation, vol. 18, pp. 385-404, 2008.

O. Gurevich and S. A. Waterman, Mapping verbal argument preferences to deverbal nouns, IJSC 4(1), 2010.

M. Nunes, Argument linking in English derived nominals, in Advances in Role and Reference Grammar, R. D. Van Valin, Ed. John Benjamins, 1993, pp. 375-432.

R. S. Crouch and T. H. King, Semantics via f-structure rewriting, LFG 2006.

C. Macleod, R. Grishman, A. Meyers, L. Barrett, and R. Reeves, NOMLEX: A lexicon of nominalizations, EURALEX 1998.

A. Meyers, R. Reeves, C. Macleod, R. Szekely, V. Zielinska, B. Young, and R. Grishman, The cross-breeding of dictionaries, LREC-2004.

C. Walker and H. Copperman, Evaluating complex semantic artifacts, LREC 2010.

S. Pradhan, H. Sun, W. Ward, J. H. Martin, and D. Jurafsky, Parsing arguments of nominalizations in English and Chinese, HLT-NAACL 2004.

C. Liu and H. T. Ng, Learning predictive structures for semantic role labeling of Nombank, ACL 2007.

M. Lapata, The disambiguation of nominalizations, Computational Linguistics, 28(3),357-388, 2002.

S. Pado, M. Pennacchiotti, and C. Sporleder, Semantic role assignment for event nominalisations by leveraging verbal data, CoLing 2008.
