Weakly-Supervised Grammar-Informed
Bayesian CCG Parser Learning
Dan Garrette (UT-Austin), Chris Dyer (CMU), Jason Baldridge (UT-Austin), Noah A. Smith (CMU)
Motivation
Annotating parse trees by hand is extremely difficult.
Can we learn new parsers cheaply?
(cheaper = less supervision)
When supervision is scarce, we have to be smarter about data.
Type-Level Supervision
• Unannotated text
• Incomplete tag dictionary: word → {tags}
Used for part-of-speech tagging for 20+ years
[Kupiec, 1992] [Merialdo, 1994]
Good tagger performance even with low supervision
[Ravi & Knight, 2009] [Das & Petrov, 2011] [Garrette & Baldridge, 2013] [Garrette et al., 2013]
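To make this kind of supervision concrete, a tag dictionary can be sketched as a mapping from word types to sets of allowed tags. The entries and the tag inventory below are invented for illustration and do not come from the talk. Treating words missing from the dictionary as fully ambiguous is one common convention and is an assumption here.

```python
# Hypothetical, hand-made tag dictionary; the entries are illustrative only.
TAG_DICT = {
    "the": {"DT"},
    "dogs": {"NNS", "VBZ"},
    "sleep": {"VB", "VBP", "NN"},
}
# Assumed (tiny) tag inventory, used for words the dictionary does not cover.
ALL_TAGS = {"DT", "NN", "NNS", "VB", "VBP", "VBZ", "JJ"}


def candidate_tags(word):
    """Return the tags allowed for a word; unknown words may take any tag."""
    return TAG_DICT.get(word.lower(), ALL_TAGS)


def count_taggings(sentence):
    """Count the tag sequences that are consistent with the dictionary."""
    total = 1
    for word in sentence:
        total *= len(candidate_tags(word))
    return total
```

The count grows multiplicatively with sentence length, which is why an incomplete dictionary over unannotated text still leaves the learner with a large search space.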
Combinatory Categorial Grammar (CCG)
Every word token is associated with a category
Categories combine to form categories of larger constituents
[Steedman, 2000] [Steedman and Baldridge, 2011]
the := np/n, dog := n
the dog ⇒ np
dogs := np, sleep := s\np
dogs sleep ⇒ s
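Type-Supervised CCG
[Figure: the sentence “the lazy dogs wander” with the tag dictionary's candidate categories listed for each word; the categories shown are np/n, n/n, n, np, s\np, (s\np)/np, …]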
CCG Parsing
the := np/n, lazy := n/n, dogs := n, wander := s\np
lazy dogs ⇒ n
the lazy dogs ⇒ np
the lazy dogs wander ⇒ s
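The bottom-up combination shown on these slides can be sketched as a small chart parser. This is a minimal illustration, not the parser from the talk: it implements only forward and backward application (no composition or type-raising), and the lexicon is an assumed mapping built from the categories that appear on the slides. The helper names `fwd`, `bwd`, `combine`, and `cky` are hypothetical.

```python
from itertools import product


# A category is an atom (str) or a tuple (result, slash, argument).
def fwd(result, arg):
    """Build the forward-slash category result/arg."""
    return (result, "/", arg)


def bwd(result, arg):
    """Build the backward-slash category result\\arg."""
    return (result, "\\", arg)


def combine(left, right):
    """Return the categories produced by forward or backward application."""
    out = []
    # Forward application:  X/Y  Y  =>  X
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        out.append(left[0])
    # Backward application:  Y  X\Y  =>  X
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        out.append(right[0])
    return out


def cky(words, lexicon):
    """Fill a chart; chart[(i, j)] holds the categories spanning words[i:j]."""
    n = len(words)
    chart = {(i, i + 1): set(lexicon[w]) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for lc, rc in product(chart[(i, k)], chart[(k, j)]):
                    cell.update(combine(lc, rc))
            chart[(i, j)] = cell
    return chart


# Assumed tag dictionary for the example sentence (illustrative only).
LEXICON = {
    "the": [fwd("np", "n")],
    "lazy": [fwd("n", "n")],
    "dogs": ["n", "np"],
    "wander": [bwd("s", "np"), fwd(bwd("s", "np"), "np")],
}

words = "the lazy dogs wander".split()
chart = cky(words, LEXICON)
```

Here `chart[(1, 3)]` covers “lazy dogs”, `chart[(0, 3)]` covers “the lazy dogs”, and `chart[(0, 4)]` covers the whole sentence, mirroring the three combination steps in the derivation.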
Why CCG?
Machine Translation [Weese, Callison-Burch, and Lopez, 2012]
Semantic Parsing [Zettlemoyer and Collins, 2005]
Type-Supervised CCG
Type-supervised learning for CCG is highly ambiguous
Penn Treebank parts-of-speech: 48 tags
CCGBank categories: 1,300+ categories
Our Strategy
The grammar formalism itself can be used to guide learning
Incorporate universal knowledge about grammar into learning
Universal Knowledge
Prefer Simpler Categories
the := np/n, lazy := (np\(np/n))/n, dog := n
lazy dog ⇒ np\(np/n)
the lazy dog ⇒ np
versus
the := np/n, lazy := n/n, dog := n
lazy dog ⇒ n
the lazy dog ⇒ np
buy := (s_b\np)/np
appears 342 times in CCGbank
e.g. “Opponents don't buy such arguments.”
buy := (((s_b\np)/pp)/pp)/np
appears once
e.g. “Tele-Communications agreed to buy half of Showtime Networks from Viacom for $225 million.” (pp: “from Viacom”; pp: “for $225 million”)
Prefer Modifier Categories
transitive verb: (he) hides (the money)
(s_b\np)/np
adverb: (he) quickly (hides) (the money)
((s_b\np)/np)/((s_b\np)/np)
Weighted Category Grammar
a ∈ {s, np, n, …}
A → a        p_term × p_atom(a)
A → B / B    (1 − p_term) × p_fwd × p_mod
A → B \ B    (1 − p_term) × (1 − p_fwd) × p_mod
A → B / C    (1 − p_term) × p_fwd × (1 − p_mod)
A → B \ C    (1 − p_term) × (1 − p_fwd) × (1 − p_mod)
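The effect of the weighted category grammar can be sketched by scoring categories recursively. This is a minimal sketch under stated assumptions: the parameter values are invented, and the sub-categories B and C are assumed to be generated recursively from the same distribution, with a modifier generating its single sub-category only once. The function name `prior` and the tuple encoding of categories are hypothetical.

```python
# Illustrative parameter values; these are assumptions, not from the talk.
P_TERM = 0.7   # probability of stopping at an atomic category
P_FWD = 0.5    # probability that a complex category uses a forward slash
P_MOD = 0.1    # probability that a complex category is a modifier (B/B or B\B)
P_ATOM = {"s": 0.3, "np": 0.3, "n": 0.4}


def prior(cat):
    """Prior probability of a category: an atom (str) or (result, slash, arg)."""
    if isinstance(cat, str):
        return P_TERM * P_ATOM[cat]
    result, slash, arg = cat
    p = 1.0 - P_TERM
    p *= P_FWD if slash == "/" else 1.0 - P_FWD
    if result == arg:
        # Modifier: the single sub-category B is generated once.
        return p * P_MOD * prior(result)
    # Non-modifier: result and argument are generated independently.
    return p * (1.0 - P_MOD) * prior(result) * prior(arg)


simple = ("n", "/", "n")                               # n/n
complex_ = (("np", "\\", ("np", "/", "n")), "/", "n")  # (np\(np/n))/n
```

Because every additional slash multiplies in another factor below one, `prior(simple)` exceeds `prior(complex_)` for any parameter setting of this form, which is the “simple is good” preference from the earlier slides.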
Prefer Likely Categories
[Figure: the derivation of “the lazy dogs wander” (lazy dogs ⇒ n, the lazy dogs ⇒ np, sentence ⇒ s), repeated from the CCG Parsing slides]
Type-Supervised Learning
• unlabeled corpus
• tag dictionary
• universal properties of the CCG formalism
(unlabeled corpus and tag dictionary: same as POS tagging)
Posterior Inference
[Johnson, Griffiths, and Goldwater, 2007]
[Figure, built up over several animation frames: the sentence “the lazy dogs wander” with its candidate categories (np/n, n/n, n, np, s\np, (s\np)/np, …), together with boxes labeled “Priors (simple is good)” and “PCFG”; successive frames show the step labels “Inside” and then “Sample”]
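The “Inside” and “Sample” steps named on these slides can be sketched for a toy PCFG whose nonterminals are CCG categories. This is only an illustration under stated assumptions: every rule and weight below is invented, the rule probabilities are held fixed, and the step that would resample those probabilities from their posterior is omitted. The names `inside_chart` and `sample_tree` are hypothetical.

```python
import random
from collections import defaultdict

# Toy PCFG over CCG categories; every rule and weight is invented.
BINARY = {  # parent -> list of (left child, right child, probability)
    "s": [("np", "s\\np", 1.0)],
    "np": [("np/n", "n", 0.7), ("np/n", "np\\(np/n)", 0.1)],
    "n": [("n/n", "n", 0.3)],
    "np\\(np/n)": [("(np\\(np/n))/n", "n", 1.0)],
}
LEXICAL = {  # (category, word) -> probability
    ("np/n", "the"): 1.0,
    ("n/n", "lazy"): 1.0,
    ("(np\\(np/n))/n", "lazy"): 1.0,
    ("n", "dogs"): 0.7,
    ("np", "dogs"): 0.2,
    ("s\\np", "wander"): 1.0,
}


def inside_chart(words):
    """inside[(i, j)][A] = total probability that A yields words[i:j]."""
    n = len(words)
    inside = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (cat, word), p in LEXICAL.items():
            if word == w:
                inside[(i, i + 1)][cat] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for parent, rules in BINARY.items():
                for left, right, p in rules:
                    for k in range(i + 1, j):
                        inside[(i, j)][parent] += (
                            p * inside[(i, k)][left] * inside[(k, j)][right]
                        )
    return inside


def sample_tree(inside, words, cat, i, j, rng):
    """Sample a subtree rooted at cat over words[i:j], in proportion to its probability."""
    if j == i + 1:
        return (cat, words[i])
    options, weights = [], []
    for left, right, p in BINARY.get(cat, []):
        for k in range(i + 1, j):
            w = p * inside[(i, k)][left] * inside[(k, j)][right]
            if w > 0.0:
                options.append((left, right, k))
                weights.append(w)
    left, right, k = rng.choices(options, weights=weights)[0]
    return (
        cat,
        sample_tree(inside, words, left, i, k, rng),
        sample_tree(inside, words, right, k, j, rng),
    )


words = "the lazy dogs wander".split()
inside = inside_chart(words)
tree = sample_tree(inside, words, "s", 0, len(words), random.Random(0))
```

The toy grammar admits two analyses of “the lazy dogs” (with lazy := n/n or lazy := (np\(np/n))/n), so repeated calls to `sample_tree` with different random states return each analysis in proportion to its probability under the current rule weights.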
CCG Parsing Results
parsing accuracy
             English   Chinese   Italian
Uniform       53.4      35.9      58.2
With Prior    55.7      42.0      60.0
Conclusion
Universal grammatical knowledge lets us make better use of weak supervision