![Page 1: Efficient Parsing for Bilexical CF Grammars Head Automaton Grammars Jason Eisner U. of Pennsylvania Giorgio Satta U. of Padova, Italy U. of Rochester](https://reader030.vdocuments.mx/reader030/viewer/2022032803/56649e215503460f94b0d606/html5/thumbnails/1.jpg)
Efficient Parsing for
• Bilexical CF Grammars
• Head Automaton Grammars
Jason Eisner, U. of Pennsylvania
Giorgio Satta, U. of Padova, Italy
U. of Rochester
When’s a grammar bilexical?
If it has rules / entries that mention 2 specific words in a dependency relation:
convene - meeting
eat - blintzes
ball - bounces
joust - with
Bilexical Grammars
Instead of VP → V NP
or even VP → solved NP
use detailed rules that mention 2 heads:
S[solved] → NP[Peggy] VP[solved]
VP[solved] → V[solved] NP[puzzle]
NP[puzzle] → Det[a] N[puzzle]
so we can exclude, or reduce probability of,
VP[solved] → V[solved] NP[goat]
NP[puzzle] → Det[two] N[puzzle]
Bilexical CF grammars
Every rule has one of these forms:
A[x] → B[x] C[y]
A[x] → B[y] C[x]
A[x] → x
so head of LHS is inherited from a child on RHS.
(rules could also have probabilities)
B[x], B[y], C[x], C[y], ... many nonterminals
A, B, C ... are “traditional nonterminals”
x, y ... are words
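The three rule forms can be written down as a small data structure. The following is a minimal sketch, not code from the talk: the class names `LexNT` and `Rule`, the function `is_bilexical`, and the choice to encode the terminal rule A[x] → x as a rule with an empty right-hand side are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LexNT:
    """A lexicalized nonterminal such as VP[solved] (hypothetical name)."""
    label: str  # traditional nonterminal, e.g. "VP"
    head: str   # head word, e.g. "solved"


@dataclass(frozen=True)
class Rule:
    """A rule LHS -> RHS; an empty RHS stands for the terminal rule A[x] -> x."""
    lhs: LexNT
    rhs: Tuple[LexNT, ...]


def is_bilexical(rule: Rule) -> bool:
    """Check that a rule has one of the three forms listed on the slide."""
    if len(rule.rhs) == 0:   # A[x] -> x
        return True
    if len(rule.rhs) != 2:   # only binary rules are allowed otherwise
        return False
    left, right = rule.rhs
    # The head of the LHS must be inherited from one of the two children.
    return rule.lhs.head in (left.head, right.head)


good = Rule(LexNT("S", "solved"), (LexNT("NP", "Peggy"), LexNT("VP", "solved")))
bad = Rule(LexNT("VP", "solved"), (LexNT("V", "eat"), LexNT("NP", "goat")))
print(is_bilexical(good), is_bilexical(bad))
```

A probabilistic grammar could attach a weight to each `Rule`; that field is left out here to keep the sketch short.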
Bilexicalism at Work
Not just selectional but adjunct preferences:
Peggy [solved a puzzle] from the library.
Peggy solved [a puzzle from the library].
Hindle & Rooth (1993) - PP attachment
Bilexicalism at Work
Bilexical parsers that fit the CF formalism:
Alshawi (1996) - head automata
Charniak (1997) - Treebank grammars
Collins (1997) - context-free grammars
Eisner (1996) - dependency grammars
Other superlexicalized parsers that don’t:
Jones & Eisner (1992) - bilexical LFG parser
Lafferty et al. (1992) - stochastic link parsing
Magerman (1995) - decision-tree parsing
Ratnaparkhi (1997) - maximum entropy parsing
Chelba & Jelinek (1998) - shift-reduce parsing
How bad is bilex CF parsing?
A[x] → B[x] C[y]
Grammar size = $O(t^3 V^2)$ where t = |{A, B, ...}| and V = |{x, y ...}|
So CKY takes $O(t^3 V^2 n^3)$
Reduce to $O(t^3 n^5)$ since relevant V = n
This is terrible ... can we do better?
Recall: regular CKY is $O(t^3 n^3)$
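The arithmetic behind these bounds can be spelled out as follows. This is only a restatement of the slide's figures, under the slide's assumption that just the n words of the input sentence are relevant.

```latex
\begin{align*}
\text{grammar size} &= O(t^3 V^2)
  && \text{3 nonterminals and 2 words per rule } A[x] \to B[x]\, C[y] \\
\text{CKY time} &= O(t^3 V^2 \cdot n^3)
  && \text{grammar size times the choices of } i, j, k \\
&= O(t^3 n^2 \cdot n^3) = O(t^3 n^5)
  && \text{only the } n \text{ input words are relevant, so } V \to n
\end{align*}
```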
The CKY-style algorithm
[Figure: CKY-style bracketing of the sentence "Mary loves the girl outdoors"]
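Why CKY is $O(n^5)$ not $O(n^3)$
[Figure: B (span i..j, head h) and C (span j+1..k, head h') combine into A (span i..k, head h).]
$O(n^3)$ combinations (of i, j, k)
$O(n^5)$ combinations (of i, j, k, h, h')
... hug visiting relatives
... advocate visiting relatives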
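The gap between the two counts can be checked by brute force. The sketch below is an illustration only; the function names are hypothetical, and it counts index combinations, not actual parser work. It assumes positions 1..n, a left span i..j, and a right span j+1..k.

```python
def count_unlexicalized(n):
    """Count (i, j, k) with 1 <= i <= j < k <= n."""
    return sum(1
               for i in range(1, n + 1)
               for j in range(i, n)
               for k in range(j + 1, n + 1))


def count_lexicalized(n):
    """Count (i, j, k, h, h') with h in i..j and h' in j+1..k."""
    return sum((j - i + 1) * (k - j)   # choices of h times choices of h'
               for i in range(1, n + 1)
               for j in range(i, n)
               for k in range(j + 1, n + 1))


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, count_unlexicalized(n), count_lexicalized(n))
```

The first count grows cubically in n and the second grows with the fifth power, which is the difference the slide points at.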
Idea #1
[Figure: B (span i..j, head h) and C (span j+1..k, head h') combine into A (span i..k, head h').]
Combine B with what C?
must try different-width C’s (vary k)
must try differently-headed C’s (vary h’)
Separate these!
Idea #1
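[Figure: two-step combination. B (span i..j, head h) is first turned into an intermediate item over i..j that records A, C and h' but no longer h; that item is then combined with C (span j+1..k, head h') to give A (span i..k, head h'). Shown beside the old CKY way, which combines B and C in one step.]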
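The separation of "vary h" from "vary k" can be sketched in code for a simplified setting. The sketch below is not the algorithm from the talk: it assumes no traditional nonterminals (t = 1), a hypothetical score matrix `s[g][h]` for attaching head word `h` as a dependent of word `g`, and it only returns the best total score. The table `wait` plays the role of the intermediate item, with the dependent's own head already summed out.

```python
NEG_INF = float("-inf")


def best_score_n4(s):
    """Best score of a projective, single-rooted dependency tree in O(n^4)."""
    n = len(s)
    # best[i][k][h]: best subtree covering words i..k whose head is h.
    best = [[[NEG_INF] * n for _ in range(n)] for _ in range(n)]
    # wait[i][j][g]: best subtree over i..j plus the arc attaching its head
    # to an outside word g; the inner head is no longer recorded.
    wait = [[[NEG_INF] * n for _ in range(n)] for _ in range(n)]

    def finish(i, j):
        # Step 1: choose the dependent's head h once per (span, outside head g).
        for g in range(n):
            if i <= g <= j:
                continue
            wait[i][j][g] = max(best[i][j][h] + s[g][h] for h in range(i, j + 1))

    for i in range(n):
        best[i][i][i] = 0.0
        finish(i, i)

    for width in range(1, n):
        for i in range(n - width):
            k = i + width
            for h in range(i, k + 1):
                cands = [NEG_INF]
                # Step 2a: left dependent i..j attaches to h inside j+1..k.
                cands += [wait[i][j][h] + best[j + 1][k][h] for j in range(i, h)]
                # Step 2b: right dependent j+1..k attaches to h inside i..j.
                cands += [best[i][j][h] + wait[j + 1][k][h] for j in range(h, k)]
                best[i][k][h] = max(cands)
            finish(i, k)
    return max(best[0][n - 1])
```

Each of the two steps ranges over four position indices, never five, which is where the O(n^4) bound comes from in this simplified setting.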
Head Automaton Grammars (Alshawi 1996)
[Good old Peggy] solved [the puzzle] [with her teeth] !
The head automaton for solved:
a finite-state device
can consume words adjacent to it on either side
does so after they’ve consumed their dependents
[Peggy] solved [puzzle] [with] (state = V)
[Peggy] solved [with] (state = VP)
[Peggy] solved (state = VP)
solved (state = S; halt)
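The four snapshots above can be replayed with a toy simulator. This is a sketch under stated assumptions: the function `run_head_automaton`, the transition-table encoding, and the three transitions themselves are hypothetical, read off the states shown on the slide and not taken from a real grammar.

```python
from collections import deque


def run_head_automaton(transitions, start, finals, left_deps, right_deps, moves):
    """Replay a sequence of 'L'/'R' moves; return the state trace or None.

    left_deps and right_deps list the dependents' head words in sentence
    order, so the word adjacent to the head is the last left dependent or
    the first right dependent.
    """
    left, right = deque(left_deps), deque(right_deps)
    state = start
    trace = [state]
    for side in moves:
        pool = left if side == "L" else right
        if not pool:
            return None
        # Consume the word currently adjacent to the head on that side.
        word = pool.pop() if side == "L" else pool.popleft()
        state = transitions.get((state, side, word))
        if state is None:
            return None
        trace.append(state)
    # Accept only if every dependent was consumed and the state is final.
    if left or right or state not in finals:
        return None
    return trace


# Hypothetical transitions for the head word "solved".
SOLVED = {
    ("V", "R", "puzzle"): "VP",
    ("VP", "R", "with"): "VP",
    ("VP", "L", "Peggy"): "S",
}

print(run_head_automaton(SOLVED, "V", {"S"}, ["Peggy"], ["puzzle", "with"], "RRL"))
```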
Formalisms too powerful?
So we have Bilex CFG and HAG in $O(n^4)$.
HAG is quite powerful - head c can require $a^n c b^n$:
... [...a3...] [...a2...] [...a1...] c [...b1...] [...b2...] [...b3...] ...
not center-embedding, [a3 [[a2 [[a1] b1]] b2]] b3
Linguistically unattested and unlikely
Possible only if the HA has a left-right cycle
Absent such cycles, can we parse faster?
(for both HAG and equivalent Bilexical CFG)
Transform the grammar
Absent such cycles, we can transform to a “split grammar”:
Each head eats all its right dependents first
I.e., left dependents are more oblique.
This allows
[Figure: a constituent A (span i..k, head h) split at h into a left half from i to h and a right half from h to k.]
Idea #2
[Figure: B (span i..j, head h) and C (span j+1..k, head h') combine into A (span i..k, head h').]
Combine what B and C?
must try different-width C’s (vary k)
must try different midpoints j
Separate these!
Idea #2
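[Figure: two-step combination. B (span i..j, head h) is first joined with the part of C from j+1 to h', giving an intermediate item from i to h'; that item is then extended with the rest of C from h' to k to give A (span i..k, head h). Shown beside the old CKY way.]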
Idea #2
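[Figure: the same two-step combination with the left boundary i dropped, so items start at the head h: the part from h to j is joined with the part from j+1 to h', then extended from h' to k to give A from h to k. Shown beside the old CKY way.]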
The $O(n^3)$ half-tree algorithm
[Figure: half-tree bracketing of the sentence "Mary loves the girl outdoors"]
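The half-tree idea can be illustrated with a cubic-time dynamic program over items whose head sits at one end of the span. The sketch below uses the widely known complete/incomplete-span formulation for projective dependency parsing; whether it matches the talk's algorithm item for item is an assumption. It also assumes no traditional nonterminals (t = 1), a hypothetical score matrix `score[h][m]`, and an artificial ROOT at position 0 that may take several dependents.

```python
NEG_INF = float("-inf")


def best_projective_score(score):
    """Best total arc score of a projective dependency tree in O(n^3).

    score[h][m] is the score for making word m a dependent of word h.
    """
    n = len(score)
    # Both tables cover words i..j. Index d = 1 means the head is the left
    # end i; d = 0 means the head is the right end j.
    complete = [[[0.0, 0.0] if i == j else [NEG_INF, NEG_INF]
                 for j in range(n)] for i in range(n)]
    incomplete = [[[NEG_INF, NEG_INF] for _ in range(n)] for _ in range(n)]

    for width in range(1, n):
        for i in range(n - width):
            j = i + width
            # Join a right half-tree of i with a left half-tree of j,
            # then add one arc between the two end words.
            joined = max(complete[i][k][1] + complete[k + 1][j][0]
                         for k in range(i, j))
            incomplete[i][j][0] = joined + score[j][i]  # arc j -> i
            incomplete[i][j][1] = joined + score[i][j]  # arc i -> j
            # Extend past the newly attached dependent to its far side.
            complete[i][j][0] = max(complete[i][k][0] + incomplete[k][j][0]
                                    for k in range(i, j))
            complete[i][j][1] = max(incomplete[i][k][1] + complete[k][j][1]
                                    for k in range(i + 1, j + 1))
    # The whole sentence is a right half-tree headed by ROOT.
    return complete[0][n - 1][1]
```

Every item is indexed by two positions and every combination by three, so no step ranges over more than three positions.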
Theoretical Speedup
n = input length
g = polysemy
t = traditional nonterms or automaton states
Naive: $O(n^5 g^2 t)$
New: $O(n^4 g^2 t)$
Even better for split grammars:
Eisner (1997): $O(n^3 g^3 t^2)$
New: $O(n^3 g^2 t)$
all independent of vocabulary size!
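Dividing the bounds pairwise shows where each saving comes from. These are ratios of asymptotic bounds only; constant factors are ignored, so they should not be read as predicted running-time ratios.

```latex
\begin{align*}
\frac{n^5 g^2 t}{n^4 g^2 t} &= n
  && \text{naive vs. new, general case} \\
\frac{n^3 g^3 t^2}{n^3 g^2 t} &= g\,t
  && \text{Eisner (1997) vs. new, split grammars} \\
\frac{n^5 g^2 t}{n^3 g^2 t} &= n^2
  && \text{naive vs. new, split grammars}
\end{align*}
```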
Reality check
Constant factor
Pruning may do just as well
“visiting relatives”: 2 plausible NP hypotheses
i.e., both heads survive to compete - common??
Amdahl’s law
much of time spent smoothing probabilities
fixed cost per parse if we cache probs for reuse
Experimental Speedup (new!)
Used Eisner (1996) Treebank WSJ parser and its split bilexical grammar
Parsing with pruning:
Both old and new $O(n^3)$ methods give 5x speedup over the $O(n^5)$ method - at 30 words
Exhaustive parsing (e.g., for EM):
Old $O(n^3)$ method (Eisner 1997) gave 3x speedup over $O(n^5)$ - at 30 words
New $O(n^3)$ method gives 19x speedup
3 parsers (pruned)
[Plot: Time (0 to 6000) vs. Sentence Length (0 to 60) for the NAIVE, IWPT-97, and ACL-99 parsers.]
3 parsers (pruned): log-log plot
[Log-log plot: Time (1 to 10000) vs. Sentence Length (10 to 100) for the NAIVE, IWPT-97, and ACL-99 parsers, with fitted curves $y = cx^{3.8}$ and $y = cx^{2.7}$.]
3 parsers (exhaustive)
[Plot: Time (0 to 80000) vs. Sentence Length (0 to 60) for the NAIVE, IWPT-97, and ACL-99 parsers.]
3 parsers (exhaustive): log-log plot
[Log-log plot: Time (1 to 100000) vs. Sentence Length (10 to 100) for the NAIVE, IWPT-97, and ACL-99 parsers, with fitted curves $y = cx^{5.2}$, $y = cx^{4.2}$, and $y = cx^{3.3}$.]
3 parsers
[Plot: Time (0 to 80000) vs. Sentence Length (0 to 60), two sets of curves for the NAIVE, IWPT-97, and ACL-99 parsers.]
3 parsers: log-log plot
[Log-log plot: Time (1 to 100000) vs. Sentence Length (10 to 100), pruned and exhaustive curves for the NAIVE, IWPT-97, and ACL-99 parsers.]
Summary
Simple bilexical CFG notion A[x] → B[x] C[y]
Covers several existing stat NLP parsers
Fully general $O(n^4)$ algorithm - not $O(n^5)$
Faster $O(n^3)$ algorithm for the “split” case
Demonstrated practical speedup
Extensions: TAGs and post-transductions