
Building Structures from Classifiers for Passage Reranking

Aliaksei Severyn¹, Massimo Nicosia¹, Alessandro Moschitti¹,²

kindly presented by: Daniil Mirylenka¹

¹ DISI, University of Trento, Italy
² QCRI, Qatar Foundation, Doha, Qatar

CIKM, 2013

Factoid  QA  


What  is  Mark  Twain's  real  name?  

Factoid  QA:  Answer  Retrieval  

Roll over, Mark Twain, because Mark McGwire is on the scene.

What is Mark Twain's real name?

Samuel Langhorne Clemens, better known as Mark Twain.

SEARCH ENGINE   KB

Mark Twain couldn't have put it any better.

fast, recall-oriented IR

Factoid  QA:  Answer  Passage  Reranking  

Roll over, Mark Twain, because Mark McGwire is on the scene.

What is Mark Twain's real name?

Samuel Langhorne Clemens, better known as Mark Twain.

SEARCH ENGINE   KB

Mark Twain couldn't have put it any better.

Roll over, Mark Twain, because Mark McGwire is on the scene.

Samuel Langhorne Clemens, better known as Mark Twain.

Mark Twain couldn't have put it any better.

slower, precision-oriented NLP/ML

Factoid QA: Answer Extraction

Roll over, Mark Twain, because Mark McGwire is on the scene.

What is Mark Twain's real name?

Samuel Langhorne Clemens, better known as Mark Twain.

SEARCH ENGINE   KB

Mark Twain couldn't have put it any better.

Roll over, Mark Twain, because Mark McGwire is on the scene.

Samuel Langhorne Clemens, better known as Mark Twain.

Mark Twain couldn't have put it any better.

slow, precision-oriented NLP/ML

Encoding question/answer pairs

⟨What is Mark Twain's real name?, Roll over, Mark Twain, because Mark McGwire is on the scene.⟩

⟨What is Mark Twain's real name?, Samuel Langhorne Clemens, better known as Mark Twain.⟩


Encoding question/answer pairs

⟨What is Mark Twain's real name?, Samuel Langhorne Clemens, better known as Mark Twain.⟩

(0.5, 0.4, 0.3, 0.0, 0.2, …, 1.0)

lexical: n-grams, Jaccard sim., etc.
syntactic: dependency path, TED
semantic: WN path, ESA, etc.

Encode q/a pairs via similarity features

brittle representation

Tedious feature engineering
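The lexical similarities listed above can be sketched concretely. This is an illustrative reconstruction (the helper names are ours, not the authors'): n-gram overlap between question and answer measured with Jaccard similarity, one feature per n-gram order.

```python
def ngrams(tokens, n):
    """All n-grams of a token list, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(s1, s2):
    """Jaccard similarity of two sets."""
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0

def lexical_features(question, answer, max_n=3):
    """One Jaccard score per n-gram order, as in the lexical feature group."""
    q, a = question.lower().split(), answer.lower().split()
    return [jaccard(ngrams(q, n), ngrams(a, n)) for n in range(1, max_n + 1)]
```

The paper's full vector also includes syntactic (dependency paths, TED) and semantic (WordNet, ESA) similarities, which this sketch omits.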

Our goal

§  Build an Answer Passage Reranking model that:
   §  encodes powerful syntactic patterns relating q/a pairs
   §  requires no manual feature engineering

Previous work

Previous state-of-the-art systems on TREC QA build complicated feature-based models derived from:

§  Quasi-synchronous grammars [Wang et al., 2007]
§  Tree Edit Distance (TED) [Heilman & Smith, 2010]
§  A probabilistic model to learn TED transformations on dependency trees [Wang & Manning, 2010]
§  CRF + TED features [Yao et al., 2013]

Our approach

§  Model q/a pairs explicitly as linguistic structures
§  Rely on kernel learning to automatically extract and learn powerful syntactic patterns

⟨What is Mark Twain's real name?, Samuel Langhorne Clemens, better known as Mark Twain.⟩, (0.5, 0.2, …, 1.0)

Roadmap

§  Learning to rank with kernels
   §  Preference reranking with kernels
   §  Tree kernels
§  Structural models of q/a pairs
   §  Structural tree representations
   §  Semantic linking to relate question and answer
§  Experiments

Preference reranking with kernels

Pairwise reranking approach
§  Given a set of q/a pairs {a, b, c, d, e}, where a and c are relevant
§  encode a set of pairwise preferences a > b, c > e, a > d, c > b, etc. via the preference kernel:

PK(⟨a, b⟩, ⟨c, e⟩) = ⟨a − b, c − e⟩ = K(a, c) − K(a, e) − K(b, c) + K(b, e)

where

K(a, c) = K(⟨Qa, Aa⟩, ⟨Qc, Ac⟩) = KTK(Qa, Qc) + KTK(Aa, Ac) + Kfvec(a, c)

Computing the kernel between q/a pairs

⟨Qa, Aa⟩, (0.5, 0.2, …, 1.0)
⟨Qc, Ac⟩, (0.1, 0.9, …, 0.4)

K(a, c) = K(⟨Qa, Aa⟩, ⟨Qc, Ac⟩) = KTK(Qa, Qc) + KTK(Aa, Ac) + Kfvec(a, c)
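The two formulas above can be combined in a small sketch. The candidate representation and the base kernels are placeholders passed in as callables; only the way the kernels are combined follows the slides.

```python
def k_pair(x, y, tree_kernel, fvec_kernel):
    """K(a, c) = KTK(Qa, Qc) + KTK(Aa, Ac) + Kfvec(a, c).

    A q/a candidate is a (question_tree, answer_tree, feature_vector) triple;
    tree_kernel and fvec_kernel are any positive semi-definite base kernels.
    """
    (qx, ax, fx), (qy, ay, fy) = x, y
    return tree_kernel(qx, qy) + tree_kernel(ax, ay) + fvec_kernel(fx, fy)

def preference_kernel(pref1, pref2, k):
    """PK(<a, b>, <c, e>) = K(a, c) - K(a, e) - K(b, c) + K(b, e),
    where each preference is a (preferred, dispreferred) candidate pair."""
    a, b = pref1
    c, e = pref2
    return k(a, c) - k(a, e) - k(b, c) + k(b, e)
```

Since PK is the inner product ⟨a − b, c − e⟩ in the implicit feature space, it can be plugged directly into any kernel machine (e.g. an SVM over preference pairs).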

Tree Kernels

§  Syntactic and Partial Tree Kernel (PTK) [Moschitti, 2006]
§  PTK generalizes the STK [Collins & Duffy, 2002] to generate more general tree fragments
§  PTK is suitable for constituency and dependency structures
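As a rough illustration of how such kernels count shared fragments, here is a minimal sketch of the subset-tree kernel of Collins & Duffy (2002) — the STK that PTK generalizes — over trees encoded as nested (label, children...) tuples. This is our simplification for exposition, not the authors' implementation.

```python
def nodes(tree):
    """Yield every subtree of a (label, child, child, ...) tuple tree."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from nodes(child)

def production(node):
    """A node's production rule: its label plus its children's labels."""
    return (node[0],) + tuple(c[0] if isinstance(c, tuple) else c
                              for c in node[1:])

def delta(n1, n2, lam):
    """Decayed count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    kids1 = [c for c in n1[1:] if isinstance(c, tuple)]
    kids2 = [c for c in n2[1:] if isinstance(c, tuple)]
    if not kids1:                       # pre-terminal: children are words
        return lam
    out = lam
    for c1, c2 in zip(kids1, kids2):
        out *= 1.0 + delta(c1, c2, lam)
    return out

def stk(t1, t2, lam=0.4):
    """Subset-tree kernel: sum delta over all pairs of nodes."""
    return sum(delta(n1, n2, lam) for n1 in nodes(t1) for n2 in nodes(t2))
```

PTK relaxes the "whole production must match" condition so that partial child sequences also contribute, which is what makes it usable on dependency trees as well.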

Structural representations of q/a pairs

§  NLP structures are rich sources of features
   §  Shallow syntactic and dependency trees
§  Linking related fragments between question and answer is important:
   §  Simple string matching [Severyn & Moschitti, 2012]
   §  Semantic linking (this work)

Relational shallow tree [Severyn & Moschitti, 2012]

⟨What is Mark Twain's real name?, Samuel Langhorne Clemens, better known as Mark Twain.⟩

Semantic linking

[Figure: question and answer parse trees; the question focus and the Person-type named entities in the answer are tagged (NER: Person)]

⟨What is Mark Twain's real name?, Samuel Langhorne Clemens, better known as Mark Twain.⟩


Semantic linking

[Figure: the FC-tagged focus word linked to Person named entities in the answer tree]

Find question category (QC): HUM

Find focus (FC): name

Find entities according to the question category in the answer passage (NER)

Link focus word and named entity tree fragments
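The four steps above can be sketched end to end. The question classifier, focus finder, and NER tagger are passed in as placeholder callables (in the paper they are tree kernel SVMs and an NER system); the category-to-type map follows the backup slides.

```python
RELATED_TYPES = {  # question category -> compatible NER types (backup slide)
    "HUM": {"Person"},
    "LOC": {"Location"},
    "NUM": {"Date", "Time", "Money", "Percentage"},
    "ENTY": {"Organization", "Person"},
}

def semantic_links(question, answer, question_classifier, focus_finder, ner):
    """Link the question's focus word to answer entities whose NER type is
    compatible with the question category."""
    qc = question_classifier(question)       # e.g. "HUM"
    focus = focus_finder(question)           # e.g. "name"
    entities = ner(answer)                   # [(entity_text, ner_type), ...]
    return [(focus, text) for text, etype in entities
            if etype in RELATED_TYPES.get(qc, set())]
```

In the actual model these links become REL tags on the corresponding tree fragments, so the tree kernel can match linked question/answer substructures.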

Question and Focus classifiers

§  Trained with the same tree kernel learning technology (SVMs)
§  No feature engineering
§  State-of-the-art performance

Feature Vector Representation

§  Lexical
   §  Term overlap: n-grams of lemmas, POS tags, dependency triplets
§  Syntactic
   §  Tree kernel score over shallow syntactic and dependency trees
§  QA compatibility
   §  Question category
   §  NER relatedness: proportion of NER types related to the question category
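The NER-relatedness feature admits a one-line sketch (our reconstruction, not the authors' code): the fraction of entity types found in the answer that are compatible with the question category, given a category-to-type map like the one in the backup slides.

```python
def ner_relatedness(question_category, answer_entity_types, related_types):
    """Proportion of the answer's NER types related to the question category.

    related_types maps a question category (e.g. "HUM") to the set of
    NER types considered compatible with it (e.g. {"Person"}).
    """
    if not answer_entity_types:
        return 0.0
    compatible = related_types.get(question_category, set())
    hits = sum(1 for t in answer_entity_types if t in compatible)
    return hits / len(answer_entity_types)
```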

Experiments  and  models  

Data
§  TREC QA 2002 & 2003 (824 questions)
§  Public benchmark on TREC 13 [Wang et al., 2007]

Baselines
§  BM25: IR model
§  CH: shallow tree [Severyn & Moschitti, 2012]
§  DEP: dependency tree
§  V: similarity feature vector model

Our approach
§  +F: semantic linking

Structural representations on TREC QA

Model     MAP   MRR    P@1
BM25      0.22  28.02  18.17
V         0.22  28.40  18.54
CH        0.28  35.63  24.88
CH+V      0.30  37.45  27.91
CH+V+F    0.32  39.48  29.63
DEP       0.30  37.87  28.05
DEP+V     0.30  37.64  28.05
DEP+V+F   0.31  37.49  28.93


Comparing to state-of-the-art on TREC 13

§  Manually curated test collection from TREC 13 [Wang et al., 2007]
§  Used as a public benchmark to compare state-of-the-art systems on TREC QA
§  Use 824 questions from TREC 2002–2003 to train and TREC 13 to test
§  Use a strong Vadv feature baseline (word overlap, ESA, translation model, etc.)

Comparing to state-of-the-art on TREC 13

System                 MAP    MRR
Wang et al., 2007      60.29  68.52
Heilman & Smith, 2010  60.91  69.17
Wang & Manning, 2010   59.51  69.51
Yao et al., 2013       63.07  74.77
Vadv                   56.27  62.94
CH+Vadv                66.11  74.19
CH+Vadv+F              68.29  75.20

Conclusions

§  Treat q/a pairs directly, encoding them into linguistic structures augmented with semantic information
§  Structural kernel technology to automatically extract and learn syntactic/semantic features
§  Semantic linking using question and focus classifiers (trained with the same tree kernel technology) and NERs
§  State-of-the-art results on TREC 13

Thanks for your attention!

BACKUP    SLIDES  


Kernel Answer Passage reranker

[Figure: system architecture — the query goes to a search engine; candidate answers pass through a UIMA pipeline with NLP annotators and the Focus and Question classifiers, which build the syntactic/semantic graph and q/a similarity features; these form train/test data for the kernel-based reranker, whose reranked answers go to evaluation]

Question Category   Named Entity types
HUM                 Person
LOC                 Location
NUM                 Date, Time, Money, Percentage
ENTY                Organization, Person

Semantic Linking

§  Use the Question Classifier (QC) and Focus Classifier (FC) to find the question category and focus word
§  Run NER on the answer passage text
§  Connect the focus word with related NERs (according to the question category) in the answer

Question Classifier

§  Tree kernel SVM multi-classifier (one-vs-all)
§  6 coarse classes from [Li & Roth, 2002]: ABBR, DESC, ENTY, HUM, LOC, NUM
§  Data: 5,500 questions from UIUC [Li & Roth, 2002]

Dataset    STK   PTK
UIUC       86.1  82.2
TREC test  79.3  78.1
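One-vs-all multi-classification as described above can be sketched like this, with one binary scorer per class (in the paper, the decision functions of tree kernel SVMs; here, arbitrary placeholder callables):

```python
COARSE_CLASSES = ["ABBR", "DESC", "ENTY", "HUM", "LOC", "NUM"]

def classify_question(question, scorers):
    """Pick the class whose binary scorer gives the highest decision value.

    `scorers` maps each class name to a callable question -> score,
    e.g. one trained one-vs-all SVM decision function per class.
    """
    return max(COARSE_CLASSES, key=lambda c: scorers[c](question))
```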

Focus classifier

§  Tree kernel SVM classifier
§  Train:
   §  Positive examples: label the parent and grandparent nodes of the focus word with an FC tag
   §  Negative examples: label all other constituent nodes with an FC tag
§  Test:
   §  Generate a set of candidate trees, labeling the parent and grandparent nodes of each word in the tree with FC
   §  Select the tree, and thus the focus word, associated with the highest SVM score
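The test-time procedure above — one FC-marked candidate tree per word, keep the highest-scoring one — can be sketched as follows. `mark_focus` and `svm_score` are placeholder callables standing in for the tree transformation and the trained tree kernel SVM.

```python
def predict_focus(tree, words, mark_focus, svm_score):
    """Return the word whose FC-marked candidate tree scores highest."""
    best_word, best_score = None, float("-inf")
    for word in words:
        candidate = mark_focus(tree, word)  # tag parent/grandparent with FC
        score = svm_score(candidate)        # SVM decision value
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```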

Focus classifier: generating candidates

[Figure: candidate trees with FC-marked nodes, scored −1 / +1 by the tree kernel SVM classifier]

Accuracy of focus classifier

§  Question Focus data:
   §  600 questions from SeCo-600 [Quarteroni et al., 2012]
   §  250 questions from GeoQuery [Damjanovic et al., 2010]
   §  2,000 questions from [Bunescu & Huang, 2010]

Dataset   ST    STK   PTK
Mooney    73.0  81.9  80.5
SeCo-600  90.0  94.5  90.0
Bunescu   89.7  98.3  96.9