
Post on 05-Feb-2016


Dependency Hashing for n-best CCG Parsing

Dominick Ng and James R. Curran

Presented by Yun Huang

Background: CCG

• CCG derivation
• Dependency
• Evaluation
– All components of a dependency structure must match the gold standard
– Precision / Recall / F-score
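
The evaluation rule above can be sketched in a few lines. This is only an illustration, not the official evaluation script: the four fields of `Dep` (head, category, slot, argument) are an assumed layout for a CCG dependency, and the words and categories in the toy data are made up. A predicted dependency counts as correct only if every field matches a gold dependency.

```python
from typing import NamedTuple, Set, Tuple


class Dep(NamedTuple):
    """Assumed (hypothetical) layout of one CCG dependency."""
    head: str       # head word
    category: str   # lexical category of the head
    slot: int       # argument slot of that category being filled
    argument: str   # word filling the slot


def prf(predicted: Set[Dep], gold: Set[Dep]) -> Tuple[float, float, float]:
    """Precision, recall and F-score over exactly matching dependencies."""
    correct = len(predicted & gold)  # tuple equality: all fields must match
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy data: the second predicted dependency has the wrong slot, so it is wrong.
gold = {
    Dep("swim", r"S\NP", 1, "fish"),
    Dep("in", r"((S\NP)\(S\NP))/NP", 2, "pond"),
}
predicted = {
    Dep("swim", r"S\NP", 1, "fish"),
    Dep("in", r"((S\NP)\(S\NP))/NP", 3, "pond"),
}
print(prf(predicted, gold))
```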

Background: CCGbank

• CCGbank was created by converting the phrase-structure trees in the Penn Treebank (PTB) into normal-form CCG derivations (99.44% coverage).

Background: C&C parser

• Supertagger: assigns possible lexical categories to each word (e.g. S\NP, (S\NP)/PP for swim)
– Tag dictionary extracted from training data
– Adaptive supertagging: β and k
• C&C parser: log-linear model parser
– POS tags and lexical categories as input
– CKY chart parsing
– N-best reranking
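
A rough sketch of how the β threshold might work in adaptive supertagging. It assumes that β is a ratio relative to the most probable category for a word, and that the parser is retried with a looser β when it fails; the tag dictionary cutoff k is left out. The function names, probabilities and the stand-in parser are all hypothetical, not the C&C API.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def assign_categories(probs: Dict[str, float], beta: float) -> List[str]:
    """Keep every category whose probability is at least beta * best probability."""
    best = max(probs.values())
    return sorted(cat for cat, p in probs.items() if p >= beta * best)


def adaptive_parse(
    sentence_probs: Sequence[Dict[str, float]],
    betas: Sequence[float],
    try_parse: Callable[[List[List[str]]], Optional[str]],
) -> Optional[Tuple[float, str]]:
    """Try each beta in turn (tightest first) until the parser succeeds."""
    for beta in betas:
        categories = [assign_categories(p, beta) for p in sentence_probs]
        result = try_parse(categories)
        if result is not None:
            return beta, result
    return None


def toy_parser(categories: List[List[str]]) -> Optional[str]:
    """Hypothetical stand-in for a parser: succeeds only with 3+ categories on some word."""
    return "parsed" if any(len(c) >= 3 for c in categories) else None


# Made-up supertagger probabilities for the word "swim".
swim = {r"S\NP": 0.8, r"(S\NP)/PP": 0.15, "N": 0.05}
print(assign_categories(swim, 0.1))    # tight beta: fewer categories
print(assign_categories(swim, 0.01))   # loose beta: more categories
print(adaptive_parse([swim], [0.1, 0.01], toy_parser))
```

A smaller β admits more categories per word, which raises the chance of finding a parse at the cost of a larger search space.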

Ambiguity in n-best CCG parsing

• Spurious ambiguity
– Normal form (usually right-branching)
• Absorption ambiguity
• Diversity problem: n-best CCG derivations, but with duplicated dependencies

Dependency Hashing (1)

• Constraint: any n-best candidate must not have the same dependencies as any candidate already in the list.
– Similar to SMT, where duplicated strings are removed from the n-best list
– Which duplicate to delete: the one inserted later, or the one with the lower score?
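
The constraint can be illustrated with an exact (non-hashed) filter over a finished candidate list. This is a simplified sketch: it runs after the fact rather than inside the chart, the function name and data are hypothetical, and it settles the "which one to delete" question by one possible policy, keeping the higher-scoring duplicate.

```python
from typing import FrozenSet, Iterable, List, Tuple

Candidate = Tuple[float, FrozenSet[str]]  # (model score, set of dependencies)


def dedupe_nbest(candidates: Iterable[Candidate], n: int) -> List[Candidate]:
    """Return up to n candidates, best score first, with distinct dependency sets."""
    seen = set()
    kept: List[Candidate] = []
    for score, deps in sorted(candidates, key=lambda c: c[0], reverse=True):
        if deps in seen:
            continue  # same dependencies as a better-scoring candidate
        seen.add(deps)
        kept.append((score, deps))
        if len(kept) == n:
            break
    return kept


# Toy list: the first two derivations differ but yield the same dependencies.
candidates = [
    (0.9, frozenset({"d1", "d2"})),
    (0.8, frozenset({"d2", "d1"})),
    (0.7, frozenset({"d1", "d3"})),
    (0.6, frozenset({"d4"})),
]
for score, deps in dedupe_nbest(candidates, n=3):
    print(score, sorted(deps))
```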

Dependency Hashing (2)

• Implementation:
– 32-bit hash value for each dependency
– Bit-wise XOR to combine sub-derivations
– Only the hash value is kept, no hash table
• Collision: distinct dependency sets can share a hash value, so some useful candidates may be missed
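
A minimal sketch of the XOR combination, assuming each dependency is represented as a tuple and using CRC-32 as a stand-in 32-bit hash (the slides do not say which hash function is used; `dep_hash` and `combine` are hypothetical names). Because XOR is commutative and associative, the combined value does not depend on the order in which sub-derivations are put together, so two derivations that produce the same dependencies through different bracketings end up with the same value.

```python
import zlib

MASK32 = 0xFFFFFFFF


def dep_hash(dep: tuple) -> int:
    """32-bit hash of one dependency (CRC-32 is an assumed stand-in)."""
    return zlib.crc32(repr(dep).encode("utf-8")) & MASK32


def combine(*hashes: int) -> int:
    """Combine the hashes of sub-derivations with bit-wise XOR."""
    out = 0
    for h in hashes:
        out ^= h
    return out


# Three made-up dependencies: (head, category, slot, argument).
a = dep_hash(("saw", r"(S\NP)/NP", 1, "John"))
b = dep_hash(("saw", r"(S\NP)/NP", 2, "Mary"))
c = dep_hash(("with", r"((S\NP)\(S\NP))/NP", 2, "telescope"))

left_branching = combine(combine(a, b), c)
right_branching = combine(a, combine(b, c))
print(left_branching == right_branching)  # same dependencies, same hash

# Comparing only integers is cheap, but equal values do not prove equal
# dependency sets: a collision would wrongly discard a distinct candidate.
```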

Diversity experiments

• Dependency

• Grammatical relation

Parsing Results

• Oracle
– Reranking upper bound

• Reranking

Gap
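
To make the oracle idea concrete: for one sentence, the oracle picks the candidate in the n-best list with the best F-score against the gold dependencies, which no reranker choosing from that list can beat. The sketch below works at sentence level with made-up dependency labels, and treats the first list entry as the reranker's choice purely for illustration; reported figures are normally computed over a whole test set.

```python
from typing import List, Set


def f_score(predicted: Set[str], gold: Set[str]) -> float:
    """F-score of a predicted dependency set against the gold set."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def oracle_choice(nbest: List[Set[str]], gold: Set[str]) -> Set[str]:
    """Candidate with the highest F-score: the upper bound for reranking."""
    return max(nbest, key=lambda deps: f_score(deps, gold))


gold_deps = {"a", "b", "c"}
nbest = [{"a", "x"}, {"a", "b", "c"}, {"a", "b"}]

chosen = nbest[0]                      # hypothetical reranker output
oracle = oracle_choice(nbest, gold_deps)
gap = f_score(oracle, gold_deps) - f_score(chosen, gold_deps)
print(sorted(oracle), gap > 0)
```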

Three types of error

• Grammar error
– Only a subset of CCGbank rules are used
– Seen rule constraint
• Supertagger error
– Restricted categories by frequency cutoff
– Probability threshold β and cutoff k
• Model error
– Suboptimal parse

Grammar Error

• Given gold-standard categories, the parser F-score is 99.49%, with 95.61% coverage
• Grammar error therefore accounts for about 0.5% F-score (100 − 99.49) and a 4.4% drop in coverage (100 − 95.61)

Supertagger and model error

• Supertagger error: differs from oracle
• Model error: differs from baseline

More experiments

• Tradeoff of speed and accuracy

• Gold/automatic POS tags

Conclusion

• Dependency hashing for n-best CCG
– Avoid derivations with the same dependencies
– Increase diversity in the n-best list
• Comprehensive error analysis
– Grammar error: 0.5%
– Supertagger error: 5%
– Model error: 7.5%

Thank you

Q & A