Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree Reporter: Longhua Qian School of Computer Science and Technology Soochow University, Suzhou, China 2008.07.23 ALPIT2008, DaLian, China

Page 1: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree

Reporter: Longhua Qian

School of Computer Science and Technology, Soochow University, Suzhou, China

2008.07.23

ALPIT2008, DaLian, China

Page 2: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Outline 1. Introduction 2. Dynamic Relation Tree 3. Unified Dynamic Relation Tree 4. Experimental results 5. Conclusion and Future Work

Page 3: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

1. Introduction

Information extraction is an important research topic in NLP. It attempts to find relevant information from the large amount of text documents available in digital archives and the WWW.

Information extraction tasks defined by NIST ACE:

Entity Detection and Tracking (EDT)

Relation Detection and Characterization (RDC)

Event Detection and Characterization (EDC)

Page 4: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

RDC Function

RDC detects and classifies semantic relationships (usually of predefined types) between pairs of entities. Relation extraction is very useful for a wide range of advanced NLP applications, such as question answering and text summarization.

E.g. the sentence “Microsoft Corp. is based in Redmond, WA” conveys the relation “GPE-AFF.Based” between “Microsoft Corp” (ORG) and “Redmond” (GPE).

Page 5: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Two approaches

Feature-based methods have dominated the research in relation extraction over the past years. However, relevant research shows that it is difficult to extract new effective features and further improve the performance.

Kernel-based methods compute the similarity of two objects (e.g. parse trees) directly. The key problem is how to represent and capture structured information in complex structures, such as the syntactic information in the parse tree for relation extraction.

Page 6: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Kernel-based related work

Zelenko et al. (2003), Culotta and Sorensen (2004), Bunescu and Mooney (2005) described several kernels between shallow parse trees or dependency trees to extract semantic relations.

Zhang et al. (2006) and Zhou et al. (2007) proposed composite kernels consisting of a linear kernel and a convolution parse tree kernel; the latter can effectively capture the structured syntactic information inherent in parse trees.
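To make the convolution parse tree kernel concrete, here is a minimal Python sketch in the style of Collins and Duffy (2001): it counts the tree fragments shared by two parse trees, down-weighted by a decay factor. The nested-tuple tree encoding and all function names are assumptions made for illustration; this is not the Tree Kernel Toolkit implementation used in the talk.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, child, child, ...); a word is a plain string.

def production(node):
    """Grammar production at a node: its label plus the labels of its children."""
    return (node[0], tuple(c if isinstance(c, str) else c[0] for c in node[1:]))

def is_preterminal(node):
    """A POS-tag node whose children are all words."""
    return all(isinstance(c, str) for c in node[1:])

def internal_nodes(tree):
    """All non-word nodes of the tree, in pre-order."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes

def tree_kernel(t1, t2, lam=0.4):
    """Weighted count of common tree fragments; lam is the decay factor."""
    @lru_cache(maxsize=None)
    def delta(n1, n2):
        if production(n1) != production(n2):
            return 0.0
        if is_preterminal(n1):
            return lam
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            if isinstance(c1, str) or isinstance(c2, str):
                continue  # words carry no further structure
            score *= 1.0 + delta(c1, c2)
        return score

    return sum(delta(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2))

a = ("NP", ("NN", "president"))
b = ("NP", ("NN", "minister"))
print(tree_kernel(a, a, lam=1.0))  # 3 shared fragments
print(tree_kernel(a, b, lam=1.0))  # only the fragment NP -> NN is shared
```

With the decay factor set to 1 the kernel value is simply the number of common fragments, which makes small cases easy to check by hand.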

Page 7: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Structured syntactic information

A tree span for a relation instance is the part of a parse tree used to represent the structured syntactic information for relation extraction.

Two currently used tree spans:

PT (Path-enclosed Tree): the sub-tree enclosed by the shortest path linking the two entities in the parse tree.

CSPT (Context-Sensitive Path-enclosed Tree): dynamically determined by further extending the necessary predicate-linked path information outside PT.

Page 8: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Current problems

Noisy information: Both PT and CSPT may still contain noisy information. In other words, more noise should be pruned away from a tree span.

Useful information: CSPT captures only the part of the context-sensitive information that relates to the predicate-linked path. That is to say, more information outside PT/CSPT may be recovered so as to discern the relationship between the two entities.

Page 9: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Our solution

Dynamic Relation Tree (DRT): Based on PT, we apply a variety of linguistics-driven rules to dynamically prune out noisy information from a syntactic parse tree and include necessary contextual information.

Unified Dynamic Relation Tree (UDRT): Instead of constructing composite kernels, various kinds of entity-related semantic information, including entity types, sub-types, mention levels etc., are unified into a Dynamic Relation Tree.

Page 10: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

2. Dynamic Relation Tree

Generation of DRT

Starting from PT, we apply three kinds of operations (i.e. Remove, Compress, and Expansion) sequentially to reshape PT, finally yielding a Dynamic Relation Tree.

Remove operation

DEL_ENT2_PRE: Removing all the constituents (except the headword) of the 2nd entity

DEL_PATH_ADVP/PP: Removing adverb or preposition phrases along the path

Page 11: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

DRT (cont’d)

Compress operation

CMP_NP_CC_NP: Compressing noun phrase coordination conjunction

CMP_VP_CC_VP: Compressing verb phrase coordination conjunction

CMP_SINGLE_INOUT: Compressing single in-and-out nodes

Expansion operation

EXP_ENT2_POS: Expanding the possessive structure after the 2nd entity

EXP_ENT2_COREF: Expanding entity coreferential mention before the 2nd entity

Page 12: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Some examples of DRT

[Figure: three parse-tree examples. Only the captions and example phrases are recoverable from the transcript.]

(a) Removal of constituents before the 2nd entity: “families of seven other former hostages” (E1-PER, E2-PER) is reduced to “families of hostages”.

(b) Compression of NP coordination: “governors from connecticut, south dakota, and montana” (E1-PER, E2-GPE) is reduced to “governors from montana”.

(c) Expansion of possessive tag right after NP: for “one of the town's two meat-packing plants” (E1-FAC, E2-GPE), the resulting span is “one of town 's”.

Page 13: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

3. Unified Dynamic Relation Tree

[Figure: the four tree setups for the phrase “president of mexico” (E1: PER, E2: GPE): T1: DRT, T2: UDRT-Bottom, T3: UDRT-Entity, T4: UDRT-Top.]

Page 14: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Four UDRT setups

T1: DRT: there is no entity-related information except the entity order (i.e. “E1” and “E2”).

T2: UDRT-Bottom: the DRT with entity-related information attached at the bottom of the two entity nodes.

T3: UDRT-Entity: the DRT with entity-related information attached in the entity nodes.

T4: UDRT-Top: the DRT with entity-related features attached at the top node of the tree.

Page 15: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

4. Experimental results

Corpus statistics

The ACE RDC 2004 data contains 451 documents and 5702 relation instances. It defines 7 major entity types, 7 major relation types and 23 relation subtypes.

Evaluation is done on 347 (nwire/bnews) documents and 4307 relation instances using 5-fold cross-validation.

Corpus processing

The corpus is parsed using Charniak’s parser (Charniak, 2001).

Relation instances are generated by iterating over all pairs of entity mentions occurring in the same sentence.
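The instance generation step can be sketched in a few lines of Python. The `Mention` record, its field names, and the choice to emit each pair once in textual order are assumptions for illustration; the slides only state that all pairs within a sentence are enumerated.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Mention:
    text: str
    entity_type: str
    sentence_id: int
    start: int  # token offset inside the sentence

def candidate_pairs(mentions):
    """Every pair of mentions that share a sentence, in textual order."""
    by_sentence = {}
    for m in mentions:
        by_sentence.setdefault(m.sentence_id, []).append(m)
    pairs = []
    for sentence_id in sorted(by_sentence):
        ordered = sorted(by_sentence[sentence_id], key=lambda m: m.start)
        pairs.extend(combinations(ordered, 2))
    return pairs

mentions = [
    Mention("Microsoft Corp", "ORG", 0, 0),
    Mention("Redmond", "GPE", 0, 5),
    Mention("WA", "GPE", 0, 7),
    Mention("Soochow University", "ORG", 1, 0),
]
for e1, e2 in candidate_pairs(mentions):
    print(e1.text, "|", e2.text)
```

Each resulting pair is one candidate relation instance; pairs without an annotated relation serve as negative examples for the classifier.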

Page 16: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Classifier

Tools

SVMLight (Joachims 1998)

Tree Kernel Toolkits (Moschitti 2004)

The training parameters C (SVM) and λ (tree kernel) are set to 2.4 and 0.4 respectively.

One vs. others strategy

K basic binary classifiers are built, each separating one class from all the others.
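The one vs. others scheme can be sketched independently of the underlying learner. In the Python below, `toy_trainer` is a hypothetical stand-in for a binary SVM (it is not SVMLight), instances are plain token sets, and the label "OTHER" is a placeholder class; only the relabelling and the highest-score decision are the point.

```python
def binarize(labels, positive):
    """Relabel a multi-class problem as +1 (target class) vs. -1 (all others)."""
    return [1 if y == positive else -1 for y in labels]

def train_one_vs_others(instances, labels, train_binary):
    """Train one binary scorer per class; returns {class: scoring function}."""
    return {c: train_binary(instances, binarize(labels, c))
            for c in sorted(set(labels))}

def predict(instance, classifiers):
    """Pick the class whose binary scorer is most confident."""
    return max(classifiers, key=lambda c: classifiers[c](instance))

def toy_trainer(instances, signs):
    """Hypothetical stand-in for an SVM: signed token overlap with the training set."""
    def score(x):
        return sum(s * len(x & inst) for inst, s in zip(instances, signs))
    return score

train_x = [{"based", "in"}, {"located", "in"}, {"president", "of"}, {"head", "of"}]
train_y = ["GPE-AFF", "GPE-AFF", "OTHER", "OTHER"]

classifiers = train_one_vs_others(train_x, train_y, toy_trainer)
print(predict({"based", "in"}, classifiers))
print(predict({"director", "of"}, classifiers))
```

With K classes this trains K scorers, and ties are broken by class order; a real system would plug a kernel SVM in place of `toy_trainer`.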

Page 17: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Contribution of various operation rules

Each operation rule is incrementally applied on the previously derived tree span.

The plus sign preceding a specific rule indicates that this rule is useful and will be added automatically in the next round.

Otherwise, the performance is unavailable.

Operation rules      P     R     F
PT (baseline)        76.3  59.8  67.1
+DEL_ENT2_PRE        76.3  62.1  68.5
DEL_PATH_PP          -     -     -
DEL_PATH_ADVP        -     -     -
+CMP_SINGLE_INOUT    76.4  63.1  69.1
+CMP_NP_CC_NP        76.1  63.3  69.1
CMP_VP_CC_VP         -     -     -
+EXP_ENT2_POS        76.6  63.8  69.6
+EXP_ENT2_COREF      77.1  64.3  70.1
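The F column is consistent with the usual balanced F-measure, F = 2PR / (P + R). The short Python check below recomputes it from the P and R values in the table; the function name is an assumption, and small differences are expected because the reported P and R are already rounded.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (P, R, reported F) for the rows of the table that have scores.
rows = {
    "PT (baseline)":     (76.3, 59.8, 67.1),
    "+DEL_ENT2_PRE":     (76.3, 62.1, 68.5),
    "+CMP_SINGLE_INOUT": (76.4, 63.1, 69.1),
    "+CMP_NP_CC_NP":     (76.1, 63.3, 69.1),
    "+EXP_ENT2_POS":     (76.6, 63.8, 69.6),
    "+EXP_ENT2_COREF":   (77.1, 64.3, 70.1),
}

for name, (p, r, reported) in rows.items():
    print(f"{name:20s} recomputed F = {f_measure(p, r):.2f}, reported F = {reported}")
```

Every recomputed value lies within 0.1 of the reported one.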

Page 18: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Comparison of different UDRT setups

Compared with DRT, the Unified Dynamic Relation Trees (UDRTs) with only entity type information improve the F-measure by about 10 units on average, owing to increases in both precision and recall.

Among the three UDRTs, UDRT-Top achieves slightly better performance than the other two.

Tree Setups    P     R     F
DRT            68.7  53.5  60.1
UDRT-Bottom    76.2  64.4  69.8
UDRT-Entity    77.1  64.3  70.1
UDRT-Top       76.4  65.2  70.4

Page 19: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Improvements of different tree setups over PT

Dynamic Relation Tree (DRT) performs better than the CSPT/PT setups.

The Unified Dynamic Relation Tree with entity-related semantic features attached at the top node of the parse tree performs best.

Tree Setups         P    R    F
CSPT over PT        1.5  1.1  1.3
DRT over PT         0.1  5.4  3.3
UDRT-Top over PT    3.9  9.4  7.2

Page 20: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Comparison with best-reported systems

It shows that our UDRT-Top performs best among tree setups using one single kernel, and even better than the two previous composite kernels.

Systems                               P     R     F
Zhou et al.: Composite kernel         82.2  70.2  75.8
Zhang et al.: Composite kernel        76.1  68.4  72.1
Zhao and Grishman: Composite kernel   69.2  70.5  70.4
Ours: CTK with UDRT-Top               80.2  69.2  74.3
Zhou et al.: CS-CTK with CSPT         81.1  66.7  73.2
Zhang et al.: CTK with PT             74.1  62.4  67.7

Page 21: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

5. Conclusion

Dynamic Relation Tree (DRT), which is generated by applying various linguistics-driven rules, can significantly improve the performance over currently used tree spans for relation extraction.

Integrating entity-related semantic information into DRT can further improve the performance, especially when it is attached at the top node of the tree.

Page 22: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

Future Work

We will focus on semantic matching in computing the similarity between two parse trees, where semantic similarity between content words (such as “hire” and “employ”) would be considered to achieve better generalization.

Page 23: Tree Kernel-based Semantic Relation Extraction using  Unified Dynamic Relation Tree

References

Bunescu R. C. and Mooney R. J. 2005. A Shortest Path Dependency Kernel for Relation Extraction. EMNLP-2005.

Charniak E. 2001. Immediate-head Parsing for Language Models. ACL-2001.

Collins M. and Duffy N. 2001. Convolution Kernels for Natural Language. NIPS-2001.

Collins M. and Duffy N. 2002. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. ACL-2002.

Culotta A. and Sorensen J. 2004. Dependency Tree Kernels for Relation Extraction. ACL-2004.

Joachims T. 1998. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. ECML-1998.

Moschitti A. 2004. A Study on Convolution Kernels for Shallow Semantic Parsing. ACL-2004.

Zelenko D., Aone C. and Richardella A. 2003. Kernel Methods for Relation Extraction. Journal of Machine Learning Research. 2003(2): 1083-1106.

Zhang M., Zhang J., Su J. and Zhou G.D. 2006. A Composite Kernel to Extract Relations between Entities with both Flat and Structured Features. COLING-ACL-2006.

Zhao S.B. and Grishman R. 2005. Extracting Relations with Integrated Information Using Kernel Methods. ACL-2005.

Zhou G.D., Su J., Zhang J. and Zhang M. 2005. Exploring Various Knowledge in Relation Extraction. ACL-2005.
