ling 581: advanced computational linguistics lecture notes january 30th

Post on 31-Mar-2015

LING 581: Advanced Computational Linguistics

Lecture Notes, January 30th

Relative clause constructions

• Terminology
– gap (__):
  • indicates where the head of the construction is interpreted
– Subject RC: the man (that|who) __ saw me
– Object RC: the man (that|who) I saw __
– Subject and object RCs can appear in subject and object positions freely:
  • The man that saw me left the room
  • The man that I saw left the room
  • I saw the man that saw me
  • I again saw the man that I saw

Note: the relative pronoun is that/who/which

Relative clause constructions

• Terminology contd.:
– Infinitival/untensed vs. tensed
  • John saw Mary (tensed)
  • John sees Mary (tensed)
  • John to see Mary (untensed)
– In RC constructions:
  • the man to see Mary
  • a person to see
  • a time to go see Mary

Note: subject is always missing … But it’s not always the RC gap

Relative clause constructions

• Terminology contd.:
– Zero refers to a missing relative pronoun
– Zero RCs:
  • the man I saw (tensed)
  • the man to see (untensed)
– *Zero:
  • *the man saw me / the man who saw me
  • *the man was seen by me / the man who was seen by me
  • The horse raced past the barn fell
– must be zero:
  • *a person that to see
  • *the man that to see Mary

Homework Exercise

Frequency counts:

                      Subject    Non-Subject
Tensed relatives
Untensed relatives

                      that    which/who/what/when/where    zero
Tensed relatives

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2:
2. zero relative clauses

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2:
3. infinitival relative clauses

Homework Exercise Review

• From page 17:

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:
1. wh- and that- relative clauses

Two subtypes:
  WHNP      NP-trace
  WHADVP    ADVP-trace

Note: the format in the guide doesn’t always match exactly with WSJ trees … -NONE-
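The two subtypes can be illustrated outside tregex. The following is a minimal Python sketch, not the course's method: it parses a bracketed tree and reports every SBAR whose WHNP-n or WHADVP-n child is co-indexed with a *T*-n trace. The helper names (parse, subtrees, relative_clauses) and the toy tree are hypothetical, and the tree is simplified rather than copied from the WSJ.

```python
import re

def parse(s):
    """Parse a Penn-style bracketed string into (label, children); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                        # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                        # consume ")"
        return (label, children)

    return node()

def subtrees(t):
    """Yield every non-leaf node, top-down."""
    yield t
    for c in t[1]:
        if isinstance(c, tuple):
            yield from subtrees(c)

def relative_clauses(tree):
    """Return (wh_label, trace_parent_label) pairs for co-indexed WH phrase / *T* trace."""
    found = []
    for sbar in subtrees(tree):
        if sbar[0] != "SBAR":
            continue
        for child in sbar[1]:
            if not isinstance(child, tuple):
                continue
            m = re.match(r"^WH(NP|ADVP)-([0-9]+)$", child[0])
            if not m:
                continue
            trace = ["*T*-" + m.group(2)]   # the trace must carry the same index
            for parent in subtrees(sbar):
                for g in parent[1]:
                    if isinstance(g, tuple) and g[0] == "-NONE-" and g[1] == trace:
                        found.append((child[0], parent[0]))
    return found

# Hypothetical, simplified tree for "the man that I saw"
TOY = ("(NP (NP (DT the) (NN man)) "
       "(SBAR (WHNP-1 (WDT that)) "
       "(S (NP-SBJ (PRP I)) (VP (VBD saw) (NP (-NONE- *T*-1))))))")

print(relative_clauses(parse(TOY)))   # [('WHNP-1', 'NP')]
```

The trace's parent label (NP here, an ADVP label for the WHADVP subtype) is what separates the two subtypes.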

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:
1. wh- and that- relative clauses

#    Matches   Pattern
1.   11598     @NP < NP < SBAR
2.   9028      @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
3.   9028      @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i << (@NP < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

Homework Exercise Review

• Browsing through the matches and refining the search is always a good idea … to see what we have inadvertently picked up or have not thought of

Homework Exercise Review

• Note: 2nd matching tree has an intervening PP:

Homework Exercise Review

• Note: 5th matching tree has an intervening PP:

Note: intervening punctuation is also common
• The plant, which is owned by Hollingsworth & Vose Co., was under contract …

Homework Exercise Review

11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

Note: the SBAR from NP-SBJ was extraposed to the VP

Note: *ICH* non-subject relative clause

Homework Exercise Review

11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

This is NOT a relative clause construction!

Homework Exercise Review

11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

The relative clause gap here is ADVP

Infinitival/non-tensed clause

Homework Exercise Review

11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

*ICH* subject relative clause

Note: the SBAR from the NP object was right extraposed to the VP

Homework Exercise Review

11598 @NP < NP < SBAR
9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

Coordination: SBAR → SBAR CC SBAR

Homework Exercise Review

• 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i)

• Excludes *ICH* cases
• Excludes coordination …
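Widening WHNP to WH(NP|ADVP) adds a parenthesized group in front of the index, which is why the variable binding moves from #1%i to #2%i. The sketch below uses Python's re module as a stand-in for tregex's Java regular expressions (both number groups by opening parenthesis, left to right); index_of is a hypothetical helper and the labels are made-up test inputs.

```python
import re

NARROW = re.compile(r"^WHNP-([0-9]+)$")         # index is capture group 1
WIDE = re.compile(r"^WH(NP|ADVP)-([0-9]+)$")    # index is now capture group 2

def index_of(label, pattern, group):
    """Return the text of the given capture group, or None if the label does not match."""
    m = pattern.match(label)
    return m.group(group) if m else None

for label in ["WHNP-1", "WHADVP-2", "WHNP", "WHPP-3"]:
    print(label, index_of(label, NARROW, 1), index_of(label, WIDE, 2))
```

Binding group 1 of the wide pattern would capture the category (NP or ADVP) instead of the index, so a co-indexation check against a *T* trace could never succeed.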

Homework Exercise Review

• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i)
• 10326 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (/^(NP|ADVP)/ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

Homework Exercise Review

• 8575 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < /^-NONE-$/))
• 5975 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

Homework Exercise Review

Let’s look at the *ICH* subcases:

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))

This is NOT a relative clause construction!

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i

Only 1 out of the 4 is NOT a relative clause construction!

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i

Search string is too restrictive: it misses SBAR-PRP and SBAR-NOM
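Why /^SBAR-([0-9]+)$/ is too restrictive can be checked on a few label shapes. This is a small sketch with Python's re standing in for tregex's regular expressions; the relaxed form /^SBAR.*-([0-9]+)$/ is the one used in the refined search that follows, while the helper name indices and the sample labels are hypothetical.

```python
import re

STRICT = re.compile(r"^SBAR-([0-9]+)$")      # only a bare SBAR plus an index
RELAXED = re.compile(r"^SBAR.*-([0-9]+)$")   # allows functional tags before the index

def indices(pattern, labels):
    """Map each matching label to the index captured by group 1."""
    result = {}
    for label in labels:
        m = pattern.match(label)
        if m:
            result[label] = m.group(1)
    return result

LABELS = ["SBAR-1", "SBAR-PRP-2", "SBAR-NOM-3", "SBAR", "SBAR-TMP"]

print(indices(STRICT, LABELS))
print(indices(RELAXED, LABELS))
```

Labels without a trailing index (SBAR, SBAR-TMP) are rejected by both patterns, so the relaxation only admits indexed SBARs that also carry a functional tag.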

Homework Exercise Review

• 116 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/)
• 115 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/#2%j << /\*T\*-([0-9]+)/#1%j)

Not a trace? BUG?

Relevance of Treebanks

• Statistical parsers typically construct syntactic phrase structure
– they’re trained on Treebank corpora like the Penn Treebank
• Note: some use dependency graphs, not trees

Parsers trained on the Treebank

• Don’t recover fully-annotated trees
– not trained using nodes with indices or empty (-NONE-) nodes
– not trained using functional tags, e.g. -SBJ
• Therefore they don’t fully parse
• Example: no SBAR node in … a movie to see

Stanford parser

Parsers trained on the Treebank

• SBAR can be forced by the presence of an overt relative pronoun, but note there is no subject gap:

Parsers trained on the Treebank

• Probabilities are estimated from frequency information of each node given surrounding context (e.g. parent node, or the word that heads the node)

• Still these systems have enormous problems with prepositional phrase (PP) attachment

• Example: (borrowed from Igor Malioutov)
– A boy with a telescope kissed Mary on the lips
– Mary was kissed by a boy with a telescope on the lips
• PP with a telescope should adjoin to the noun phrase (NP) a boy
• PP on the lips should adjoin to the verb phrase (VP) headed by kiss
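Estimating probabilities from frequency information can be sketched as relative-frequency counting of rules given the parent label. This is a deliberately simplified illustration, not how the Stanford parser is implemented: it conditions on the parent node only, and the three toy trees are hypothetical, not Treebank data.

```python
from collections import Counter

# Hypothetical toy trees as nested tuples: (label, child, ...); leaves are strings.
TOY_TREES = [
    ("VP", ("VBD", "kissed"), ("NP", ("NNP", "Mary")),
           ("PP", ("IN", "on"), ("NP", ("DT", "the"), ("NNS", "lips")))),
    ("VP", ("VBD", "saw"),
           ("NP", ("NP", ("DT", "a"), ("NN", "boy")),
                  ("PP", ("IN", "with"), ("NP", ("DT", "a"), ("NN", "telescope"))))),
    ("VP", ("VBD", "left"), ("NP", ("DT", "the"), ("NN", "room"))),
]

def rules(tree):
    """Yield (parent, tuple of child labels) for every phrasal node."""
    label, *children = tree
    if all(isinstance(c, str) for c in children):
        return                          # preterminal: lexical rules are skipped
    yield (label, tuple(c[0] for c in children))
    for c in children:
        yield from rules(c)

def rule_probabilities(trees):
    """Relative-frequency estimate: count(parent -> children) / count(parent)."""
    rule_counts = Counter(r for t in trees for r in rules(t))
    parent_counts = Counter()
    for (parent, _), n in rule_counts.items():
        parent_counts[parent] += n
    return {r: n / parent_counts[r[0]] for r, n in rule_counts.items()}

probs = rule_probabilities(TOY_TREES)
print(probs[("VP", ("VBD", "NP", "PP"))])   # PP attached to the VP
print(probs[("NP", ("NP", "PP"))])          # PP attached to the NP
```

With parent-only conditioning the attachment preference is the same whatever the words are, which is one reason richer context such as the head word is used, and why PP attachment remains hard.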

Active/passive sentences

• Examples using the Stanford Parser:

Both active and passive sentences are parsed incorrectly

Active/passive sentences

• Examples:

X on the lips modifies Mary
X on the lips modifies telescope

Homework Exercise

• Use tregex to find out how many passive sentences there are in the Treebank WSJ section.
• The passive construction (according to the Bracketing Guidelines)
– Note: by-phrase containing logical subject (LGS) is optional
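Before writing the tregex pattern, the shape of the search can be prototyped on bracketed strings. The sketch below is a rough string-level approximation in Python, not a tregex query. It assumes, and this should be checked against the Bracketing Guidelines, that a passive participle (VBN) is immediately followed by an NP containing only a co-indexed null element *-n, and that the optional by-phrase carries NP-LGS. The function names and both example trees are hypothetical.

```python
import re

# Assumption: (VBN word) directly followed by (NP (-NONE- *-n)) marks a passive.
PASSIVE = re.compile(r"\(VBN [^()\s]+\)\s*\(NP \(-NONE- \*-[0-9]+\)\)")
# Assumption: the logical subject of the by-phrase is labelled NP-LGS.
LGS = re.compile(r"\(NP-LGS\b")

def count_passives(bracketed):
    """Count passive participles under the stated assumption."""
    return len(PASSIVE.findall(bracketed))

def has_logical_subject(bracketed):
    """True if an NP-LGS by-phrase is present (it is optional)."""
    return LGS.search(bracketed) is not None

# Hypothetical, simplified trees
LONG_PASSIVE = ("(S (NP-SBJ-1 (NNP Mary)) (VP (VBD was) (VP (VBN kissed) "
                "(NP (-NONE- *-1)) (PP (IN by) (NP-LGS (DT a) (NN boy))))))")
ACTIVE = "(S (NP-SBJ (DT A) (NN boy)) (VP (VBD kissed) (NP (NNP Mary))))"

print(count_passives(LONG_PASSIVE), has_logical_subject(LONG_PASSIVE))
print(count_passives(ACTIVE), has_logical_subject(ACTIVE))
```

A string match like this ignores tree structure, so it will miss passives where other material intervenes between the participle and the null element; as with the relative clause searches, browsing the matches is the way to find out what was missed.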
