From Verbal Argument Structures to Nominal Ones: A Data-Mining Approach Olya Gurevich 1 December 2010


From Verbal Argument Structures to Nominal Ones:

A Data-Mining Approach

Olya Gurevich

1 December 2010


© 2010 Microsoft

Talk Outline

Powerset: a natural language search engine (acquired by Microsoft in 2008)

Deverbal nouns and their arguments

Data collection and corpus-based modeling

Baseline system

Experiments

Conclusions


Powerset: Natural Language Search

Queries and documents undergo syntactic and semantic parsing

Semantic representations allow both more constrained and more expansive matching compared to keywords
► Who invaded Rome ≠ Who did Rome invade
► Who did Rome invade ≈ Who was invaded by Rome
► Who invaded Rome ≈ Who attacked Rome
► Who invaded Rome ≈ Who was the invader of Rome

Worked on English-language Wikipedia

NL technology initially developed at Xerox PARC (XLE)


Deverbal Nouns

Events often realized as nouns, not verbs
► Armstrong’s return after his retirement ==> Armstrong returned after he retired
► The destruction of Rome by the Huns was devastating ==> The Huns destroyed Rome
► The Yankees’ defeat over the Mets ==> The Yankees defeated the Mets
► Kasparov’s defense of his knight ==> Kasparov defended his knight

In search, need to map deverbal expression to the verb (or vice versa)


Deverbal Types

Eventive
► destruction, return, death

Agent-like
► Henri IV was the ruler of France ==> Henri IV ruled France

Patient-like
► Mary is an IBM employee ==> IBM employs Mary


Deverbal Role Ambiguity

Deverbal syntax doesn’t always determine argument role
► They jumped to the support of the Queen ==> They supported the Queen
► They enjoyed the support of the Queen ==> The Queen supported them
► We talked about the Merrill Lynch acquisition ==> Was Merrill Lynch acquired? Or did it acquire something?

Particularly problematic if underlying verb is transitive but the deverbal noun has only one argument


Baseline system

LFG-based syntactic parser (XLE)
► Grammar is rule based
► Disambiguation component statistically trained

List of ~4000 deverbals and corresponding verbs, from
► WordNet derivational morphology
► NomLex, NomLex Plus
► Hand curation

Verb lexicon with subcategorization frames


Baseline system cont.

Parse sentence using XLE

If a noun is in the list of ~4000 deverbals, map its arguments into those of a corresponding verb using rule-based heuristics. For transitive verbs:
► X’s DV of Y ==> subj(V, X); obj(V, Y), etc.
■ Obama’s support of reform ==> subj(support, Obama); obj(support, reform)
► X’s DV ==> subj(V, X) [default to most-frequent pattern]
■ Obama’s support ==> subj(support, Obama)
► DV of X ==> obj(V, X) [default to most-frequent pattern]
■ support of reform ==> obj(support, reform)
► X DV ==> no role
► Subject-sharing support verbs: make, take

Goal: to improve over default assignments
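
The default assignments for transitive verbs can be sketched as a small lookup table. This is only an illustration of the rules listed on this slide, not the actual XLE-based implementation; the pattern labels ("poss", "of", "prenom") and the function names are hypothetical.

```python
from typing import Optional

# Default role per syntactic pattern for deverbals of transitive verbs.
# The pattern labels are illustrative names, not taken from the system.
BASELINE_DEFAULTS = {
    "poss": "subj",    # X's DV  ==> subj(V, X)
    "of": "obj",       # DV of X ==> obj(V, X)
    "prenom": None,    # X DV    ==> no role
}


def baseline_role(pattern: str) -> Optional[str]:
    """Return the default role for an argument in the given pattern."""
    if pattern not in BASELINE_DEFAULTS:
        raise ValueError(f"unknown pattern: {pattern}")
    return BASELINE_DEFAULTS[pattern]


def map_arguments(verb: str, args: dict) -> list:
    """Map {pattern: argument} for one deverbal onto roles of its verb."""
    facts = []
    for pattern, arg in args.items():
        role = baseline_role(pattern)
        if role is not None:  # prenominal arguments get no role by default
            facts.append(f"{role}({verb}, {arg})")
    return facts
```

For example, `map_arguments("support", {"poss": "Obama", "of": "reform"})` yields the two facts given for "Obama’s support of reform", while a prenominal argument yields no fact at all.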


Baseline system cont.

Agent-like Deverbals
► X’s DVer ==> obj(V, X)
■ the project’s director ==> subj(direct, director); obj(direct, project)
► DVer of X ==> obj(V, X)
■ composer of the song ==> subj(compose, composer); obj(compose, song)

Patient-like Deverbals
► X’s DVee ==> subj(V, X)
■ IBM’s employee ==> subj(employ, IBM); obj(employ, employee)
► DVee of X ==> subj(V, X)
■ captive of the rebels ==> subj(capture, rebels); obj(capture, captive)
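
A minimal sketch of the agent-like and patient-like rules, assuming the deverbal's kind ("agent" or "patient") is already known from the lexicon. The function name and the string output format are illustrative choices, not part of the system described here.

```python
def nominal_roles(verb: str, deverbal: str, kind: str, arg: str) -> list:
    """Roles for "X's DV" or "DV of X" when DV is agent- or patient-like.

    kind is "agent" (e.g. director) or "patient" (e.g. employee).
    """
    if kind == "agent":
        # The noun itself fills the subject slot; X is the object.
        return [f"subj({verb}, {deverbal})", f"obj({verb}, {arg})"]
    if kind == "patient":
        # X is the subject; the noun itself fills the object slot.
        return [f"subj({verb}, {arg})", f"obj({verb}, {deverbal})"]
    raise ValueError(f"unknown deverbal kind: {kind}")
```

Calling `nominal_roles("employ", "employee", "patient", "IBM")` reproduces the analysis of "IBM’s employee" shown on this slide.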


Deverbal Task

Goal: predict relation between transitive V and argument X, given {X’s DV}, {DV of X}, or {X DV}
► the program’s renewal ==> obj(renew, program)
► the king’s promise ==> subj(promise, king)
► the destruction of Rome ==> obj(destroy, Rome)
► the rule of Henri IV ==> subj(rule, Henri IV)
► the Congress decision ==> subj(decide, Congress)
► castle protection ==> obj(protect, castle)
► domain adaptation ==> ?(adapt, domain)


Inference from verb usage

Large corpus data can indicate lexical preferences
► Armstrong’s return ==> Armstrong returned
► return of the book ==> the book was returned

If: X is more often a subject of V than object
► then: X’s DV or DV of X ==> subj(V, X)

Need to count subj(V,X) | obj(V,X) occurrences for all possible pairs (V,X)

Need lots of parsed sentences!
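
The counting step can be sketched as follows, assuming the parser output has already been reduced to (verb, role, argument) triples. The helper names and the toy data are hypothetical.

```python
from collections import Counter


def count_roles(triples):
    """Count (verb, role, arg) triples extracted from parsed sentences."""
    return Counter(triples)


def preferred_role(counts, verb, arg):
    """Return the more frequent of 'subj' and 'obj' for (verb, arg).

    Returns None on a tie, including the case where the pair was never seen.
    """
    subj = counts[(verb, "subj", arg)]  # a Counter yields 0 for unseen keys
    obj = counts[(verb, "obj", arg)]
    if subj == obj:
        return None
    return "subj" if subj > obj else "obj"


# Toy data standing in for a parsed corpus.
triples = (
    [("return", "subj", "Armstrong")] * 3
    + [("return", "obj", "book")] * 2
    + [("return", "subj", "book")]
)
counts = count_roles(triples)
```

With these toy counts, "Armstrong" comes out as subject-preferring and "book" as object-preferring for "return", matching the two examples on the slide.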


Data Sparseness

Where to get enough parsed data to count all occurrences to model any pair (V,X)?

We have parsed all of the English Wikipedia (2M docs, 121M sentences)

■ cf. Penn TreeBank (~50,000 sentences)

Oceanography: distributed architecture for fast extraction / analysis of huge parsed data sets

72M Role (Verb, Role, Arg) examples
■ 69% of these appear just once, 13% just twice!

Not enough data to make a good prediction for each individual argument

■ need to generalize across arguments
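
Sparseness figures of this kind can be computed from a frequency-of-frequencies table. The sketch below assumes the percentages refer to distinct (Verb, Role, Arg) triples; the slide does not say so explicitly, and the function name is hypothetical.

```python
from collections import Counter


def frequency_profile(triples):
    """Map each occurrence count k to the share of distinct triples seen k times."""
    counts = Counter(triples)                  # triple -> number of occurrences
    freq_of_freq = Counter(counts.values())    # k -> number of triples seen k times
    total = len(counts)
    return {k: n / total for k, n in freq_of_freq.items()}
```

On a corpus where half of the distinct triples occur once, `frequency_profile(...)[1]` would be 0.5; the slide reports 0.69 for the Wikipedia data under whatever definition was used there.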


Deverbal-only model

for each deverbal DV and related verb V

find corpus occurrences of overlapping arguments
► X SUBJ V, X OBJ V, and X’s DV for all X

if (X_SUBJ / X_OBJ) > 1.5, consider X “subject preferring” for this DV

if DV has more subject-preferring than object-preferring arguments, then map:
► X’s DV ==> subj(V, X) for all X

(conversely for object preference)

if the majority of overlapping arguments for a given V are neither subjects nor objects, DV is “other-preferring”

For each DV, average over all arguments X


Walk-through example

renewal : renew
► Argument: program
■ program’s renewal: 2 occurrences
■ obj(renew, program): 72 occurrences
■ subj(renew, program): 9 occurrences
■ {renewal, program} is object-preferring
► Argument: he
■ his renewal: 18 occurrences
■ subj(renew, he): 615 occurrences
■ obj(renew, he): 42 occurrences
■ {renewal, he} is subject-preferring
► Object-preferring arguments: 15
► Subject-preferring arguments: 9
► Overall preference for X’s renewal: obj
► But is there a way to model non-majority preferences?
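
The deverbal-only procedure and this walk-through can be sketched as a two-level vote. The 1.5 ratio comes from the slides; applying the same ratio in reverse for object preference is an assumption (the slides only say "conversely"), and the "other-preferring" case is left out. Function names are hypothetical.

```python
RATIO = 1.5  # subject/object ratio threshold given on the slides


def argument_preference(subj_count, obj_count, ratio=RATIO):
    """Classify one argument X as 'subj', 'obj', or None (no clear preference)."""
    # Multiplying instead of dividing avoids division by zero.
    if subj_count > ratio * obj_count:
        return "subj"
    if obj_count > ratio * subj_count:
        return "obj"
    return None


def deverbal_preference(arg_counts, ratio=RATIO):
    """Majority vote over arguments; arg_counts maps X -> (subj_count, obj_count)."""
    votes = {"subj": 0, "obj": 0}
    for subj_count, obj_count in arg_counts.values():
        pref = argument_preference(subj_count, obj_count, ratio)
        if pref is not None:
            votes[pref] += 1
    if votes["subj"] == votes["obj"]:
        return None
    return "subj" if votes["subj"] > votes["obj"] else "obj"
```

With the counts from the walk-through, `argument_preference(9, 72)` classifies "program" as object-preferring and `argument_preference(615, 42)` classifies "he" as subject-preferring. The deverbal-level decision then depends only on how many arguments fall on each side (15 vs. 9 for "renewal").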


Overall preferences

For possessive arguments, e.g. X’s renewal, X’s possession
► Subj-preferring: 1786 deverbals (67%)
► Obj-preferring: 884 (33%)
► Default: subj

For of arguments, e.g. renewal of X, possession of X
► Subj-preferring: 839 (29%)
► Obj-preferring: 2036 (71%)
► Default: obj

For prenominal arguments, e.g. X protection, X discovery
► Subj-preferring: 373 (11%)
► Obj-preferring: 1037 (31%)
► Other-preferring: 1933 (58%)
► Default: other (= no role)


Incidence: subjects

[Chart "Subject head (N=1000)": Deverbal vs. Verb]

[Chart "Subject error (N=220)": Agent, Patient, 2-argument, "of", poss, prenom, Other deverb, Verb]


Incidence: objects

[Chart "Object head (N=1000)": Deverbal vs. Verb]

[Chart "Object error (N=260)": Agent, Patient, 2-argument, "of", poss, prenom, Other deverb, Verb]


Evaluation Data

X’s DV:
► 1000 hand-annotated sentences
► Possible judgments:
■ subj
■ obj
■ other
► Evaluate classification between subj and non-subj

Judged \ System | Subj      | Non-subj
Subj            | Correct   | Incorrect
Non-subj        | Incorrect | Correct


Evaluation

DV of X:
► 750 hand-annotated sentences
► Evaluate classification between obj and non-obj

Judged \ System | Obj       | Non-obj
Obj             | Correct   | Incorrect
Non-obj         | Incorrect | Correct


Evaluation

X DV:
► 999 hand-annotated sentences
► Evaluate classification between subj, obj, and other (no role)

Judged \ System | Subj      | Obj       | Other
Subj            | Correct   | Incorrect | Incorrect
Obj             | Incorrect | Correct   | Incorrect
Other           | Incorrect | Incorrect | Correct


Evaluation measures

Error = Incorrect / (Correct + Incorrect)
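
A minimal sketch of this measure, assuming the error rate is the share of incorrect decisions among all decisions, and that per-class error (e.g. "Subj error") is computed over items with that judged label. The second function assumes the reductions quoted in the results are relative to the baseline error. All names are hypothetical.

```python
def error_rate(judged, predicted, label=None):
    """Share of incorrect predictions; if label is given, only items judged as label."""
    pairs = [(j, p) for j, p in zip(judged, predicted) if label is None or j == label]
    if not pairs:
        raise ValueError("no items to evaluate")
    incorrect = sum(1 for j, p in pairs if j != p)
    return incorrect / len(pairs)


def error_reduction(baseline_error, model_error):
    """Relative error reduction of a model over the baseline."""
    return (baseline_error - model_error) / baseline_error
```

For the binary possessive and 'of' evaluations, the caller would first collapse the judged labels (e.g. obj and other into non-subj) before calling `error_rate`.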


Deverbal-only Results: Possessives

Combining all arguments for each deverbal reduces role-labeling error by 39% for possessive arguments

[Bar chart "Possessive arguments": Subj error, Non-subj error, and Overall error for Baseline vs. Deverbal-only model]


Deverbal-only Results: ‘of’ args

Error rate is reduced by 44%

[Bar chart "'of' arguments": Obj error, Non-obj error, and Overall error for Baseline vs. Deverbal-only model]


Deverbal-only Results: prenominal args

Error rate is reduced by 28%


Too much smoothing?

Combining all arguments is fairly drastic

Possible features of arguments that may impact behavior:
► Ontological class: hard to get reliable classifications
► Animacy: subjects are more animate than objects (crosslinguistically true)
■ the program’s renewal vs. his renewal

Possible features of deverbals and verbs that may impact behavior:
► Ontological class
► Active vs. passive use of verbs


Animacy-based model

Split model into separate predictors for animate(X) and inanimate(X)
► Animate: pronouns (I, you, he, she, they)
► Inanimate: common nouns
► Ignored proper names due to poor classification into people vs. places vs. organizations

If model does not have a prediction for the class of argument encountered, fall back to deverbal-only model

Results:
► more accurate subject labeling for animate arguments
► lower recall and less accurate object labeling
► overall error rate is about the same as deverbal-only model
► possibly due to insufficient training data
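
The animacy split with fallback can be sketched as below. It assumes the argument is given as a lemma (so "his" arrives as "he"), that proper names are flagged by the caller, and that both models are plain dictionaries. Treating every remaining argument as a common noun is a simplification; all names are hypothetical.

```python
ANIMATE_PRONOUNS = {"i", "you", "he", "she", "they"}  # list given on the slide


def animacy_class(arg, is_proper_name=False):
    """Return 'animate', 'inanimate', or None (proper names are ignored)."""
    if is_proper_name:
        return None
    return "animate" if arg.lower() in ANIMATE_PRONOUNS else "inanimate"


def predict_role(deverbal, arg, animacy_model, deverbal_model, is_proper_name=False):
    """Look up (deverbal, animacy class); fall back to the deverbal-only model."""
    cls = animacy_class(arg, is_proper_name)
    if cls is not None and (deverbal, cls) in animacy_model:
        return animacy_model[(deverbal, cls)]
    return deverbal_model.get(deverbal)
```

With `animacy_model = {("renewal", "animate"): "subj"}` and `deverbal_model = {"renewal": "obj"}`, "his renewal" is labeled subj, while "the program’s renewal" falls back to the deverbal-only prediction obj.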


Lexicalized model

Try to make predictions for individual DV+argument pairs

If the model has insufficient evidence for the pair, default to deverbal-only model

Results:
► For possessive args, performance about the same as deverbal-only
► For ‘of’ args, performance slightly worse than deverbal-only
► For prenominal args, much worse performance
► Model is vulnerable to data sparseness and systematic parsing errors (e.g. weather conditions)
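
The back-off behaviour of a lexicalized model can be sketched as follows. The slides do not say what counts as "insufficient evidence", so the minimum-count cutoff below is a hypothetical value, and reusing the 1.5 ratio for pair-level decisions is likewise an assumption.

```python
MIN_EVIDENCE = 5  # hypothetical cutoff; the slides do not give one


def lexicalized_role(deverbal, arg, pair_counts, deverbal_model,
                     min_evidence=MIN_EVIDENCE, ratio=1.5):
    """Predict from counts for this (deverbal, arg) pair, else back off.

    pair_counts maps (deverbal, arg) -> (subj_count, obj_count).
    """
    subj_count, obj_count = pair_counts.get((deverbal, arg), (0, 0))
    if subj_count + obj_count >= min_evidence:
        if subj_count > ratio * obj_count:
            return "subj"
        if obj_count > ratio * subj_count:
            return "obj"
    # Too little evidence, or no clear preference: use the deverbal-only model.
    return deverbal_model.get(deverbal)
```

A pair with plenty of evidence, such as ("renewal", "he") with counts (615, 42), gets its own prediction; a rare or unseen pair inherits whatever the deverbal-only model says for "renewal".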


DV+animacy / lex results: possessives

[Bar chart "Possessive arguments": Subj error, Non-subj error, and Overall error for Baseline, Deverbal-only, Animacy, and Lexicalized models]


DV+animacy / lex results: ‘of’ arguments

[Bar chart "'of' arguments": Obj error, Non-obj error, and Overall error for Baseline, Deverbal-only, Animacy, and Lexicalized models]


DV+lex results: prenominal args

[Bar chart "Prenominal arguments": Subj error, Obj error, Other error, and Overall error for Baseline, Deverbal-only, and Lexicalized models]


Training data size: 10K vs. 2M docs

[Chart "Coverage vs. Size of training data": Possessive and "of" arguments, 10K vs. 2M docs]

[Chart "Error rate vs. Size of training data": Possessive and "of" arguments, Baseline vs. 10K vs. 2M docs]


Support (“light”) verbs

Tried using the same method to derive support verbs
► e.g. make a decision, take a walk, receive a gift

Look for patterns like
► John decided vs. John made a decision ==> lv(decision, make, sb)
► We agreed vs. We had an agreement ==> lv(agreement, have, sb)

Initial predictions had quite a few spurious patterns

After manual curation
► 96 DV-V pairs got a support verb
► 25 unique support verbs
► 28 support verb / argument patterns

Default model fairly fragile

Tight semantic relationship between light verbs and deverbals makes this method less applicable


Directions for future work

Less ad hoc parameter setting

Further lexicalization of the model
► Predictions for ontological classes of arguments
► Use properties of verbal constructions (e.g. passive vs. active, tense, etc.)

More fine-grained classification of non-subj/obj roles
► director of 12 years
► Bill Gates’ foundation
► the Delhi Declaration


Conclusions

Knowing how arguments typically participate in events allows interpretation of ambiguous deverbal syntax

Large parsed corpora are a valuable resource

Even the simplest models greatly reduce error

More data is better


Thanks to:

Scott Waterman

Dick Crouch

Tracy Holloway King

Powerset NLE Team


References

M. Banko and E. Brill, Scaling to very very large corpora for natural language disambiguation, ACL 2001.

S. Riezler, T. H. King, R. Kaplan, J. T. Maxwell III, R. Crouch, and M. Johnson, Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques, ACL 2002.

S. A. Waterman, Distributed parse mining, in SETQA-NLP 2009.

O. Gurevich, R. Crouch, T. H. King, and V. de Paiva, Deverbal nouns in knowledge representation, Journal of Logic and Computation, vol. 18, pp. 385-404, 2008.

O. Gurevich and S. A. Waterman, Mapping verbal argument preferences to deverbal nouns, IJSC 4(1), 2010.

M. Nunes, Argument linking in English derived nominals, in Advances in Role and Reference Grammar, R. V. Valin, Ed. John Benjamins, 1993, pp. 375-432.

R. S. Crouch and T. H. King, Semantics via f-structure rewriting, LFG 2006.

C. Macleod, R. Grishman, A. Meyers, L. Barrett, and R. Reeves, NOMLEX: A lexicon of nominalizations, EURALEX 1998.

A. Meyers, R. Reeves, C. Macleod, R. Szekely, V. Zielinska, B. Young, and R. Grishman, The cross-breeding of dictionaries, LREC-2004.

C. Walker and H. Copperman, Evaluating complex semantic artifacts, LREC 2010.

S. Pradhan, H. Sun, W. Ward, J. H. Martin, and D. Jurafsky, Parsing arguments of nominalizations in English and Chinese, HLT-NAACL 2004.

C. Liu and H. T. Ng, Learning predictive structures for semantic role labeling of Nombank, ACL 2007.

M. Lapata, The disambiguation of nominalizations, Computational Linguistics, 28(3), 357-388, 2002.

S. Pado, M. Pennacchiotti, and C. Sporleder, Semantic role assignment for event nominalisations by leveraging verbal data, CoLing 2008.