mining dependency relations for query expansion in passage retrieval renxu sun, chai-huat ong,...
TRANSCRIPT
![Page 1: Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006](https://reader036.vdocuments.mx/reader036/viewer/2022082821/5697bffc1a28abf838cc1d89/html5/thumbnails/1.jpg)
Mining Dependency Relations for Query Expansion in Passage Retrieval
Renxu Sun, Chai-Huat Ong, Tat-Seng Chua
National University of Singapore
SIGIR 2006
Introduction
Query expansion (QE) is a method for improving the effectiveness of IR
– by providing additional contextual information to the original queries
Traditional passage retrieval algorithms perform a density based weighting of query terms
– they prefer passages containing query terms that are close together
Introduction
Local Context Analysis (LCA) [Croft, 1996]
– A common QE technique based on term co-occurrence statistics
– utilizes only statistical information instead of semantic information
– unable to differentiate between noisy and good quality expansion terms
Introduction
[Cui et al., 2005]
– The use of a fuzzy dependency relation matching method for passage retrieval
– significant improvement in MRR over the density based passage retrieval systems
– This work points towards the importance of performing syntactical analysis
– The longer queries benefit more from this method
Query expansion is needed for short queries
Introduction
The main contribution of this paper is employing a relation based model to perform:
– contextual term selection to enhance density based passage retrieval
– relation extraction to enhance the fuzzy dependency relation matching approach
To make the expansion process more robust, it extracts relations and terms from an external corpus (the Web).
Query Expansion Based on Dependency Relation
Fig. Framework of Relation Based Query Expansion
Dependency Relation Paths from Web Snippets
The Web is considered as a parallel corpus:
1. Send the queries to Google and collect the top k snippets
2. Each sentence is considered as a passage, and each snippet contains 2 sentences on average (k=100, similar to LCA [Croft, 1996])
3. Use Minipar, a dependency grammar parser, to parse the passages.
Examples of Parse Tree
Fig. The parse trees of the sample question and sentence. <When, wha, head, purchased> is a relation path. The directions of relations are ignored in experiments.
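Since the figure itself is not reproduced here, the following is a minimal sketch of how a relation path between two words could be read off a dependency parse when relation directions are ignored. The edge list, the relation labels, and the function name `relation_path` are hypothetical illustrations, not Minipar output or the authors' code, and the sketch assumes each word occurs only once in the sentence.

```python
from collections import deque

def relation_path(edges, start, end):
    """Return [start, rel_1, ..., rel_m, end] along the tree, or None if unreachable.

    edges: iterable of (head, relation, dependent) triples from a dependency parse.
    """
    # Build an undirected adjacency list, because relation directions are ignored.
    adj = {}
    for head, rel, dep in edges:
        adj.setdefault(head, []).append((dep, rel))
        adj.setdefault(dep, []).append((head, rel))

    # Breadth-first search; in a tree the first path found is the only path.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, rels = queue.popleft()
        if node == end:
            return [start, *rels, end]
        for nxt, rel in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, rels + [rel]))
    return None

# Hypothetical toy parse (labels are illustrative only).
edges = [
    ("purchased", "subj", "company"),
    ("purchased", "obj", "rival"),
    ("purchased", "mod", "in"),
    ("in", "pcomp-n", "1998"),
]
print(relation_path(edges, "company", "1998"))
# → ['company', 'subj', 'mod', 'pcomp-n', '1998']
```

The returned list follows the <Start_Node, Rel_1, …, Rel_m, End_Node> shape used for relation paths elsewhere in these slides.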
Term Expansion for Density Based Retrieval System (1/2)
Ranking candidate expanded terms
– A variant formula of that in LCA
– Global importance
  IDF of the expanded term
– Local importance
  The relation path linking to the query term
Adding the top k terms to the original query with weight (1-0.9*i/k)
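As a small illustration of the weighting rule (1-0.9*i/k), the sketch below assigns weights to the top k ranked terms. The slide does not say whether the rank i starts at 0 or 1; this sketch assumes i runs from 1 to k, and the function name `expansion_weights` is hypothetical.

```python
def expansion_weights(ranked_terms, k):
    """Map each of the top-k expansion terms to the weight 1 - 0.9 * i / k.

    Assumption: i is the 1-based rank of the term, so weights fall
    linearly from 1 - 0.9/k (best term) down to 0.1 (k-th term).
    """
    top = ranked_terms[:k]
    return {term: 1 - 0.9 * i / k for i, term in enumerate(top, start=1)}

weights = expansion_weights(["a", "b", "c", "d"], k=3)
for term, w in weights.items():
    print(term, round(w, 2))
```

Under this assumption, lower-ranked expansion terms contribute progressively less, and no expansion term receives a weight of zero.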
Term Expansion for Density Based Retrieval System (2/2)
$$
Score(T_k, Q) = \prod_{t_i \in Q} \left( \delta + \frac{\log_{10}\!\left( \sum_{j=1}^{n} path\_score(T_k, t_i, j) \right) \cdot idf_{T_k}}{\log_{10} N} \right)^{idf_{t_i}}
$$
where
$T_k$ = the term to be ranked;
$idf_{T_k} = \max(1.0, \log_{10}(N / N_{T_k}))$;
$idf_{t_i} = \max(1.0, \log_{10}(N / N_{t_i}))$;
$p_j$ = the $j$th passage in the passage set $P$;
$score(Rel_i)$ = the score of an individual relation, which is obtained through training;
$\delta$ is set to 0.1 to avoid zero values.
$$
path\_score(T_k, t_i, j) = \prod_{Rel \in path(T_k, t_i, j)} score(Rel), \qquad T_k \in p_j,\ t_i \in p_j
$$
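The sketch below shows how a candidate term could be scored under one reading of this slide: an LCA-style product over query terms, Score(T_k, Q) = ∏ (δ + log10(Σ_j path_score(T_k, t_i, j)) · idf_Tk / log10 N)^idf_ti. That reading, the function name `term_score`, and the input layout are assumptions. The clamp that treats a summed path score of 1 or less as zero co-occurrence is also an assumption, added so the base of the power never drops below δ.

```python
import math

def term_score(path_scores, idf_T, idf_t, N, delta=0.1):
    """Score one candidate expansion term T_k against a query.

    path_scores: dict mapping each query term t_i to a list of
                 path_score(T_k, t_i, j) values, one per passage j.
    idf_T:       idf of the candidate term T_k.
    idf_t:       dict mapping each query term t_i to its idf.
    N:           the N used in the idf definitions (must be > 1).
    """
    score = 1.0
    for t, scores in path_scores.items():
        total = sum(scores)
        # Assumption: clamp so the log term is never negative or undefined.
        co = math.log10(total) if total > 1 else 0.0
        score *= (delta + co * idf_T / math.log10(N)) ** idf_t[t]
    return score

# Hypothetical numbers, for illustration only.
print(term_score({"purchased": [6.0, 4.0]}, idf_T=2.0, idf_t={"purchased": 1.0}, N=100))
```

With δ = 0.1, a query term that never co-occurs with the candidate multiplies the score by a small factor instead of zeroing it out, which matches the stated purpose of δ.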
Relation Based Retrieval Method (RBM)
RBM is used to perform passage re-ranking based on the initial retrieval result obtained by the density based method (DBM).
The similarity between passage S and Q is computed by finding all possible relation path pairs ($P_S$, $P_Q$) from S and Q that have the same starting and ending nodes.
The translation probability $Prob(P_S \mid P_Q)$ is the sum over all possible alignments:
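$$
Prob(P_S \mid P_Q) = \frac{\varepsilon}{m^n} \sum_{a_1=1}^{m} \cdots \sum_{a_n=1}^{m} \prod_{i=1}^{n} P_t\!\left( Rel_i^{(S)} \mid Rel_{a_i}^{(Q)} \right)
$$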
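To make the "sum over all possible alignments" concrete, here is a minimal sketch assuming the translation probability has the IBM Model 1 form, (ε / m^n) times the sum over alignments of the product of per-relation translation probabilities, where n and m are the lengths of the passage and question paths. The constant `eps`, the toy probability table, and both function names are hypothetical. The second function uses the standard observation that this sum factorizes into a product of per-relation sums, which avoids enumerating all m^n alignments.

```python
import math
from itertools import product

def prob_brute(ps, pq, p_t, eps=1.0):
    """Sum the alignment products explicitly over all m**n alignments."""
    n, m = len(ps), len(pq)
    total = 0.0
    for alignment in product(range(m), repeat=n):
        total += math.prod(p_t(ps[i], pq[a]) for i, a in enumerate(alignment))
    return eps / m**n * total

def prob_factored(ps, pq, p_t, eps=1.0):
    """Same value, computed as a product of per-relation sums."""
    n, m = len(ps), len(pq)
    return eps / m**n * math.prod(sum(p_t(s, q) for q in pq) for s in ps)

# Hypothetical relation translation table; unseen pairs get a small default.
table = {("subj", "subj"): 0.6, ("subj", "obj"): 0.1, ("mod", "mod"): 0.5}

def p_t(s, q):
    return table.get((s, q), 0.01)

ps = ["subj", "mod"]          # relations on the passage path
pq = ["subj", "obj", "mod"]   # relations on the question path
print(prob_brute(ps, pq, p_t), prob_factored(ps, pq, p_t))
```

Both functions return the same value; the factored form needs n·m lookups rather than n·m^n.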
Relation Path Expansion
A technique to be used on top of the fuzzy relation based retrieval [Cui, 2005]
The path expansion technique extracts additional relation paths linking the expanded terms with original query terms.
Select the path associated with $T_k$ that has the maximum $path\_score(T_k, t, j)$ to be expanded, weighted by (1-0.9*i/k)
$$
ex\_path(T_k) = \left\{\, path(T_k, t, j) \;\middle|\; path\_score(T_k, t, j) = \max_{t \in Q,\ 1 \le j \le n} \{ path\_score(T_k, t, j) \} \,\right\}
$$
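The selection step can be sketched as follows: for each of the top k expansion terms, keep the candidate path with the highest path score and attach the weight (1-0.9*i/k). The data layout, the function name `expand_paths`, and the 1-based rank i are assumptions; on ties this sketch keeps only the first maximal path, whereas the set definition would keep all of them.

```python
def expand_paths(candidates, ranked_terms, k):
    """Pick the best-scoring path for each of the top-k expansion terms.

    candidates:   dict mapping a term T_k to a list of (path, path_score)
                  pairs, one per (query term t, passage j) combination.
    ranked_terms: expansion terms in descending score order.
    Returns a list of (term, best_path, weight) triples.
    """
    expanded = []
    for i, term in enumerate(ranked_terms[:k], start=1):
        paths = candidates.get(term)
        if not paths:
            continue  # no path links this term to any query term
        best_path, _ = max(paths, key=lambda pair: pair[1])
        expanded.append((term, best_path, 1 - 0.9 * i / k))
    return expanded

# Hypothetical candidates, for illustration only.
candidates = {
    "company": [(["company", "subj", "purchased"], 0.4),
                (["company", "subj", "mod", "when"], 0.7)],
    "rival": [(["rival", "obj", "purchased"], 0.5)],
}
for row in expand_paths(candidates, ["company", "rival", "unseen"], k=3):
    print(row)
```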
Model Training
Retrieve the top 100 snippets from Google for each $Q_i$.
A path <Start_Node, Rel_1, …, Rel_m, End_Node> in the snippets is “relevant” if $Start\_Node \in A_i$ and $End\_Node \in Q_i$
– The relevant paths are those inferring a useful term to the question.
Employ a unigram language model to train the weight of each relation:
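$$
P(Rel_i) = \frac{C_{Rel_i \in relevant\_path} + 1}{\sum_{l=1}^{N} C_{Rel_l \in relevant\_path} + N}
$$

$$
score(Rel_i) = \frac{\log\!\left( C_{Rel_i \in relevant\_path} + 1 \right)}{\log\!\left( \sum_{l=1}^{N} C_{Rel_l \in relevant\_path} + N \right)}
$$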
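A minimal sketch of the training step, under one reading of the slide: each relation's count over the relevant paths is add-one smoothed into a unigram probability, (C + 1) / (ΣC + N), and its score is the log-scaled count, log(C + 1) / log(ΣC + N). The slide does not define N; this sketch assumes it is the number of distinct relation types, as in standard add-one smoothing. The function name and toy paths are hypothetical.

```python
import math
from collections import Counter

def train_relation_scores(relevant_paths):
    """Estimate P(Rel) and score(Rel) from relevant relation paths.

    relevant_paths: list of paths [start_node, rel_1, ..., rel_m, end_node].
    Assumption: N is the number of distinct relation types observed.
    """
    # Count every relation label, skipping the start and end nodes.
    counts = Counter(rel for path in relevant_paths for rel in path[1:-1])
    n_types = len(counts)
    denom = sum(counts.values()) + n_types
    prob = {rel: (c + 1) / denom for rel, c in counts.items()}
    score = {rel: math.log(c + 1) / math.log(denom) for rel, c in counts.items()}
    return prob, score

# Hypothetical relevant paths, for illustration only.
paths = [
    ["1998", "pcomp-n", "mod", "purchased"],
    ["1998", "mod", "purchased"],
]
prob, score = train_relation_scores(paths)
print(prob)
print(score)
```

Under this assumption the probabilities over observed relation types sum to 1, and every score lies between 0 and 1.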
Evaluations
The evaluations aim to verify three hypotheses
1. It’s effective to incorporate a dependency relation based query expansion technique to select high quality terms in a density based method.
2. The use of a dependency relation based query expansion technique to extract relation paths further improves the precision of passage ranking when integrated with the fuzzy relation matching method.
3. As short queries with fewer key terms are likely to have word mismatch problems
Experiment Setup
Training data
– 10,255 factoid QA pairs from TREC-8 and TREC-9 QA tasks
– The top 100 snippets from Google for each question
– 8,892 relevant paths extracted
Testing data
– The AQUAINT news corpus
– 324 factoid questions in TREC-12 QA task
  Excluding 30 questions with NIL answers and 59 questions that do not have any ground truth passages
5 Comparison systems
– DBS, DBS+LCA, DBS+DRQET, RBS, RBS+DRQER
Experiment Result-1
Table 1. Overall performance comparison. All improvements are significant.
Experiment Result-2
Fig. MRR before and after query expansion vs. number of non-trivial question terms.
Experiment Result-3
Testing dataset 2: 356 short queries in TREC-11 and TREC-12 QA tasks
The improvement is more significant than that in Table 1.
DBS+DRQET performs better than RBS.
Conclusion and Future Work
Two dependency relation based query expansion techniques, DRQET and DRQER, are presented.
The experimental results show that RBS+DRQER performs best among the 5 systems.
We also studied the relationship between query lengths and improvements by query expansion.
Directions for future work: (1) explore the use of different models and their combinations for relation based query expansion; (2) conduct detailed analysis on the performance of RBS+DRQER on different types of queries.