jakub dutkiewicz , czesław ję[email protected]; {jakub.dutkiewicz,...
TRANSCRIPT
POZNAN Contribution to TREC PM 2018 (Notebook Paper)
Jakub Dutkiewicz2, Czesław Jędrzejek2 Artur Cieślewicz1
1 Department of Clinical Pharmacology, Poznan University of Medical Sciences, Poznan, Poland
2 IARiII, Poznan University of Technology, Poznan, Poland
[email protected]; {jakub.dutkiewicz, czeslaw.jedrzejek}@put.poznan.pl
Abstract. This work describes the medical information retrieval systems designed by the Poznan Consortium for TREC PM,
personalized medicine track, which was submitted to the TREC 2018. The baseline is the Terrier BB2. For Clinical Trials this work
uses the following options:
• Terrier BB2_simple_noprf: query without extension (sq_nprf)
• Terrier BB2_simple_w2v_prf: query extended with w2v and terrier prf (sqw2vprf)
• Terrier BB2 _variant_noprf: query + gene variant; without extension (vq_noprf)
• BB2_variant_w2v_prf: query + gene variant extended with w2v and terrier prf (vqw2vprf)
• SQ_results: experimental search of the sql database using keywords from query (sq)
Our best result is vq_noprf which is significantly better (approximately 0.08 for evaluated measures). With suitable word2method
the results are better compared to query extension using Mesh and disease taxonomy. For Medline abstracts the documents are
placed in the SQL database. For each document templates of diseases, genes, and applicability as Precision Medicine vs research
objects are matched. Patterns are saved using regular expressions. The pattern association with the document is stored in the SQL
database. For Medline abstracts our results are little worse than the TREC 2018 median.
1 Introduction
The TREC PM 2018 [TREC PM. 2018], is following three previous Clinical Decision Support Track contests and in particular,
the TREC PM 2017 [Roberts et al., 2017] . Its aim was the retrieval of biomedical articles relevant for answering generic clinical
questions about medical records. The 2018 track focuses on an important use case in clinical decision support: providing useful
precision medicine-related information to clinicians treating cancer patients. The 2018 track largely repeats the structure and
evaluation of the PM 2017 track. TREC 2018 Precision Medicine Track developed Relevance Judgment Guidelines [Relevance
Judgment, 2017], the more specific rules compared to TREC CDS to qualify an answer as definitely relevant.
1.1 Source and target documents and rules
There are two target document collections for the Precision Medicine track: scientific abstracts (a January 2017 snapshot of
PubMed abstracts plus from AACR and ASCO proceedings targeted toward cancer therapy) and clinical trials (an April 2017
snapshot of ClinicalTrials.gov). Although we submitted runs in both areas, our main concentration was clinical trials task. Topics’
description for clinical trials was structured information. Each topic has four primary fields: Disease, Gene (Variant), and
Demographic. For instance:
<topics task="2018 TREC Precision Medicine">
<topic number="1">
<disease>melanoma</disease>
<gene>BRAF (V600E)</gene>
<demographic>64-year-old male</demographic>
</topic>
This structured information allows for a precise definition of „Definitely Relevant” answer: a result should have Disease in {Exact,
More General, More Specific categories}, at least one Gene is Exact, and both Demographic and Other are in. For some topics the
gene field contains a specification of the gene action or indirect gene function (Table 1).
Table 1 Additional/alternative gene functions.
Gene function Topic number
amplification 7, 11, 14
loss of function 5, 17, 23, 24
truncation 15
(extensive) tumor infiltrating lymphocytes 21, 22
high tumor mutational burden 20
rearrangement 16
tumor cells with >50% membranous PD-L1 expression 18
tumor cells negative for PD-L1 expression 19
high serum LDH levels 25
The “(extensive) tumor infiltrating lymphocytes” description can be only attributed to genes, though it is not direct. E.g. the
correlation between high TIL levels and improved clinical outcome was most prominent in triple-negative breast cancer (TNBC).
<topic number="18">
<disease>melanoma</disease>
<gene>tumor cells with >50% membranous PD-L1 expression</gene>
<demographic>48-year-old female</demographic>
</topic>
<topic number="16">
<disease>melanoma</disease>
<gene>NTRK1 rearrangement</gene>
<demographic>60-year-old male</demographic>
</topic>
<topic number="20">
<disease>melanoma</disease>
<gene>high tumor mutational burden</gene>
<demographic>86-year-old female</demographic>
</topic>
<topic number="24">
<disease>melanoma</disease>
<gene>APC loss of function</gene>
<demographic>47-year-old male</demographic>
</topic>
<topic number="31">
<disease>head and neck squamous cell carcinoma</disease>
<gene>CDKN2A</gene>
<demographic>64-year-old male</demographic>
</topic>
Topics 1-6 and 8-13 were characterized by a gene variant. The average of the best TREC PM results as provided by organizers for
them is 0.80 compared with the same result for the rest of topics which is 0.70.
1.2 Experience
The Poznan Consortium team for TREC PM consists of contributors of two institutions: Department of Clinical Pharmacology,
Poznan University of Medical Sciences, and Information Systems and Technologies Division, IARiII, Poznan University of
Technology. We participated in TREC CDS 2016 track [Dutkiewicz et al, 2017], TREC PM 2017 track [Cieslewicz et al. , 2018a]
and in bioCADDIE 2016 [Cieslewicz et al. , 2018b]. In this work we use two type of embedding: designed for biomedical
applications [Chiu et al., 2016] and classical word embedding [Mikolov et al., 2013a] on the Pubmed abstracts corpus. We did not
preprocess the target Open Access Subset of PubMed Central (PMC).
2 Related work
2.1 Baseline system
Our experience shows that for biomedical systems Terrier 4.2 is among the best for baseline systems. The best performing depending
were 1) BB2 (Bernoulli-Einstein model with Bernoulli after-effect and normalization [Amati, van Rijsbergen, 2012], also denoted
as “DPH + Bo1) [Dutkiewicz, 2017], 2) LGD [Clinchant, Gaussier, 2010] (a log-logistic model for information retrieval)
[Cieslewicz, 2018b]. Another valuable feature implemented in Terrier is pseudo relevance feedback query expansion (PRF) – a
mechanism allowing for extraction of n most informative terms from m top ranked documents (ranking created in the first search
run) which are then added to the original query in the second retrieval rank. Terrier provides both parameter free (Bose-Einstein 1,
Bose-Einstein 2, Kullback-Leibler) and parameterized (Rocchio) models for query expansion [Rocchio, 1971]. Rocchio feedback
approach was developed using the Vector Space Model. The modified vectors are moved in a direction closer or farther away, from
the original query depending on whether documents are related or non-related.
2.2 Query expansion
Expanding queries by adding potentially relevant terms is a common practice in improving relevance in IR systems. There are
many methods of query expansion. Relevance feedback takes the documents on top of a ranking list and adds terms appearing in
these document to a new query. In this work we use the idea to add synonyms and other similar terms to query terms before the
pseudo-relevance feedback. This type of expansion can be divided into two categories. The first category involves the use of
ontologies or lexicons (relational knowledge). In biomedical area UMLS, MeSH [MeSH, 2018], SNOMED-CT, ICD-10, WordNet,
and Wikipedia are used [JaiswaL, 2016]. Generally, the result of lexicon type of expansion is positive. The second category
is word embedding (WE). This belongs to a class of distributional semantics, feature learning techniques in natural language
processing. One can draw experience on effect of using lexicons from other semantic task areas.
For natural language queries requiring an answer using multiple choice, relational learning using dictionaries encompassing the
whole corpus gives always better results than pure word embedding (word2vec). However, having synonym dictionaries (flat
structure) can significantly improve the word2vec results.
3 Retrieval methodology setup
Figure 3. Flowchart of the system developed for TREC 2018 PM track.
3.1 Methodology developed for Medline abstracts
1. Data indexing.
For the set of medical abstract documents, we perform regular expression based feature extraction. We look for
documents which relate to various genes, diseases, drugs which may relate to diseases within the query set or
gene variants, by applying the dedicated pattern to the document. Regex patterns relate to the features and
contain a variety of syntactic irregularities, such as:
- -delimiter between the main part of the gene/gene variant name and a number,
- -various letter case options,
- various gene/gene variant/disease/drug synonyms.
Exact value of a feature is calculated as number of the regular expression matches divided by the document
length (in words).
2. Query construction.
Queries are translated into SQL SELECT statements. The generated statements are:
a. Select documents which are classified as related to a given disease
b. Select documents which contain information about a given gene
c. Select documents which contain information about a given variant
d. Select a specific value of a feature for a given document
3. Information retrieval
We perform a single run "boolean" on the set. A document is taken into relevant document list only if it contains
at least one gene name named in the query and it contains one of the disease name synonyms. We assume that a
document, which relates to a disease is a Precision Medicine document by default. We rank these documents
which contain correct gene variant higher than documents which miss the gene variant and higher than documents
with incorrect gene variant. Further ranking is calculated upon the values of the related features - the higher the
combined sum of features is, the higher ranking score the document receives.
3.2 Methodology developed for Clinical Trials data
1. Data indexing.
The content of clinical trials documents was indexed as follows:
- tag DOCNO: the value of nct_id tag
- tag <TEXT>: the values of tags brief_title, official_title, brief_summary, detailed_description,
primary_outcome, secondary_outcome, condition, arm_group, condition_browse, intervention_browse,
keyword; additionally, inclusion criteria were extracted from criteria tag
2. Query construction.
Three types of queries were constructed:
1. simple query: find documents with any term from <disease> tag or any gene name from <gene> tag;
gene variants were not taken into consideration
2. variant query: find documents with any term from <disease> tag or any term from <gene> tag; gene
variants were taken into consideration
3. SQL query: browse data from SQL database to find records that have any term from <disease> tag and
any term from <gene> tag; gene variants were not taken into consideration
Queries 1) and 2 were also expanded in a following manner: terms from simple or variant query were expanded
using 2 vector space models (word2vec): first calculated by us on the basis of bioCADDIE corpus; second
calculated by [Chiu , et al., 2016] on the text corpus based on PubMed citations and full-text articles
As a result, five queries were constructed for each topic.
3. Information retrieval
4 terrier runs were carried out, using BB2 as ranking function:
1. BB2_simple_noprf: simple query was put as input for terrier
2. BB2_simple_w2v_prf: simple query expanded with word2vec was put as input for terrier; additional
expansion was carried out using terrier pseudo relevance feedback (PRF)
3. BB2_variant_noprf: variant query was put as input for terrier
4. BB2_variant_w2v_prf: variant query expanded with word2vec was put as input for terrier; additional
expansion was carried out using terrier PRF
The last run (SQ_results) was performer as follows:
5. data from <nct_id>, <brief_title>, <official_title>, <brief_summary>, <detailed_description>,
<keyword>, <condition_browse>, and <study_type> xml tags of each document was put into SQL
database; inclusion criteria were also extracted from <criteria> tag and were placed in the database in
“inclusion” field for each topic, an SQL query (described in “Query construction” section) was
executed obtain results were then ranked, according to the number of occurrences of disease, gene
name and gene variant terms in each record fields; documents describing international studies were
granted additional points to total score
All generated results were then checked according to the value of <demographic> tag of each topic. Resulting
documents that were describing trials recruiting patients with inappropriate age or gender were removed from
the result set.
4 Results
In this section we go through the outcome of every retrieval setup implemented by our group and applied to the competition data
sets. We compare our results to median and best of the PM submissions. Finally, we discuss the best application for each setup. For
the evaluation we show measures that were used by TREC PM evaluators for abstracts and clinical trials.
4.1 Results for Medline abstracts
In this category we submitted one run using Boolean search. Our average result is little worse than TREC PM median.
Figure 4.1 infNDCG results for Medline abstracts for individual topics and median averages (ours and the TREC PM 2018).
4.2 Results for Clinical Trials data
In this category we submitted 5 runs, all using BB2 as a baseline. In TREC PM 2017 we used the LGD option that was the best for
bioCADDIE 2016 challenge. For the purpose of this work, having qrels for the last year TRC PM we verified that BB2 method
beats the LGD one.
The results generated for Clinical trials data have shown that using Pseudo Relevance Feedback (PRF) to expand the query had
negative impact for almost all measures (see the Table 4.1). Terrier PRF was configured to expand queries with 10 terms from top
2 documents. Observed worsening of the results could be explained with the fact that PRF was carried out before documents
describing trials recruiting patients with inappropriate age or gender were removed from the result set. Another aspect worth
considering is that Clinical trials information retrieval required finding the documents describing clinical trials for which the patient
described in the query is eligible for the recruitment. PRF, by adding additional terms to the query, could improve the score of not
relevant documents.
The best results were obtained with word embedding using both vectors obtained by [Chiu et al., 2016] and by our classical
[Mikolov, et al., 2013] type of calculations with vector cosine measure = 0.875.
Green color indicates the results better than 0.5, red color the results worse than 0.5 than median results.
Table 4.1 Results for various options for the provided Clinical Trials data runs (intensity of these colors proportional to divergence
from the medians).
topic sq_nprf vq_noprf vqw2vprf sqw2vprf sq trec_best trec_median 1 0.613 0.806 0.626 0.6239 0.7373 0.8489 0.6516
2 0.7232 0.7913 0.7334 0.7295 0.7724 0.9054 0.7724
3 0.6989 0.7247 0.699 0.7119 0.6665 0.8424 0.6937
4 0.3416 0.3416 0.3416 0.3231 0.2157 0.4584 0.2583
5 0.859 0.9239 0.8797 0.8593 0.8885 0.9645 0.8084
6 0.7031 0.7531 0.774 0.6924 0.7488 0.8685 0.6651
7 0.8326 0.8565 0.8646 0.8323 0.7893 0.9652 0.7421
8 0.7013 0.7013 0.7013 0.5796 0.4353 1.0942 0.4944
9 0.1323 0.1323 0.1323 0.2658 0.1385 0.2658 0.1396
10 0.7195 0.7195 0.7195 0.6899 0.6886 0.9078 0.7135
11 0.8052 0.9154 0.8104 0.7709 0.8272 0.9291 0.7746
12 0.9055 0.9055 0.9055 0.8497 0.935 0.9627 0.8837
13 0.8697 0.8697 0.8697 0.848 0.8141 0.8941 0.7536
14 0.7874 0.7256 0.7266 0.7506 0.6935 0.8067 0.6667
15 0.0895 0.0893 0.0875 0.0411 0 0.2744 0.0518
16 0 0 0 0 0 0.5 0
17 0.2215 0.2617 0.259 0.1949 0.2962 0.608 0.2754
18 0.1249 0.0555 0 0 0.4544 0.6743 0.26
19 0.242 0.2635 0.021 0 0.4082 0.6679 0.2635
20 0 0 0 0 0 0.5027 0.0337
21 0.0193 0.6864 0 0.017 0.6035 0.8145 0.4292
22 0.0116 0.4929 0 0.0108 0.3634 0.5291 0.3016
23 0.4168 0.4363 0.4321 0.4326 0.3356 0.6817 0.417
24 0 0 0 0 0 0.6309 0
25 0.0781 0.0708 0.0708 0.0868 0 0.2934 0.0708
26 0.7522 0.7522 0.7517 0.743 0.7809 0.8929 0.7112
27 0.7732 0.7732 0.7601 0.8193 0.2841 0.9098 0.6276
28 0.4738 0.4738 0.4738 0.4774 0.4579 0.7743 0.5339
29 0.5055 0.5055 0.5021 0.5001 0.3899 0.7313 0.3151
30 0.8085 0.8085 0.8089 0.8373 0.6184 0.904 0.7763
31 0 0 0 0 0 0.469 0.0746
32 0.1631 0.1631 0.1504 0.4484 0.0826 0.9194 0.234
33 0.522 0.522 0.5204 0.5209 0.5029 0.7171 0.3435
34 0.4972 0.4972 0.5633 0.4693 0.4453 0.7263 0.3654
35 0.5798 0.5798 0.5798 0.3468 0 0.842 0.1505
36 0.0896 0.0896 0.0896 0.082 0.1232 0.7636 0.0887
37 0.0188 0.0188 0.0252 0 0.0503 0.6119 0.2384
38 0.5231 0.5231 0.5231 0.5558 0.553 0.6993 0.5231
39 0.6131 0.6131 0.2519 0.2034 0.6262 0.868 0.5486
40 0.7283 0.7283 0.7283 0.7201 0.7705 0.8546 0.6206
41 0.7082 0.7082 0.6903 0.8744 0 0.8744 0.389
42 0.363 0.363 0.363 0.363 0.363 0.7735 0.363
43 0.6545 0.6545 0.6546 0.6223 0.6793 0.8021 0.5773
44 0.0779 0.0779 0.0779 0.0779 0.0922 0.9987 0.4119
45 0.7108 0.7108 0.7095 0.6527 0.5548 1.1734 0.5996
46 0.6196 0.6196 0.6196 0.6139 0.5447 0.7885 0.4801
47 0.727 0.727 0.727 0.7496 0.5802 0.8101 0.55
48 0.4825 0.4825 0.4825 0.571 0.3536 0.6527 0.4727
49 0.1205 0.1205 0.15 0.147 0 0.6131 0
50 0.4342 0.4342 0.4381 0.456 0.5999 0.7341 0.3676
all 0.4568 0.4894 0.4459 0.4432 0.4253 0.755894 0.429668
Figures 4.2-4.4 present results for particular topics compared with the best and median TREC PM 2018 results.
=
Figure 4.2 infNDCG results for Clinical Trials for individual topics and median averages (ours and the TREC PM 2018).
Figure 4.3 R-prec_ct for Clinical Trials for individual topics and median averages (ours and the TREC PM 2018).
sq, 1, 0.5909sq, 2, 0.5556sq, 3, 0.5691
sq, 4, 0.0638
sq, 5, 0.6
sq, 6, 0.5378sq, 7, 0.5424
sq, 8, 0.4815
sq, 9, 0.3125
sq, 10, 0.72sq, 11, 0.7273sq, 12, 0.7083sq, 13, 0.6667
sq, 14, 0.6429
sq, 15, 0sq, 16, 0
sq, 17, 0.1538
sq, 18, 0.3333
sq, 19, 0.25
sq, 20, 0
sq, 21, 0.6212
sq, 22, 0.4186sq, 23, 0.4231
sq, 24, 0sq, 25, 0
sq, 26, 0.8108
sq, 27, 0.2
sq, 28, 0.1111
sq, 29, 0.3846
sq, 30, 0.5
sq, 31, 0
sq, 32, 0.1455
sq, 33, 0.3649sq, 34, 0.4
sq, 35, 0
sq, 36, 0.1081
sq, 37, 0.0154
sq, 38, 0.5455
sq, 39, 0.4706
sq, 40, 0.5255
sq, 41, 0
sq, 42, 0.2
sq, 43, 0.6
sq, 44, 0.0283
sq, 45, 0.3913
sq, 46, 0.5sq, 47, 0.4894
sq, 48, 0.25
sq, 49, 0
sq, 50, 0.45
sq, all, 0.3482
vqw2vprf, 1, 0.5545
vqw2vprf, 2, 0.6349
vqw2vprf, 3, 0.6911
vqw2vprf, 4, 0.3404
vqw2vprf, 5, 0.5917
vqw2vprf, 6, 0.6303
vqw2vprf, 7, 0.6525vqw2vprf, 8,
0.5926
vqw2vprf, 9, 0.375
vqw2vprf, 10, 0.84
vqw2vprf, 11, 0.7273
vqw2vprf, 12, 0.7917
vqw2vprf, 13, 0.8148
vqw2vprf, 14, 0.8214
vqw2vprf, 15, 0vqw2vprf, 16, 0
vqw2vprf, 17, 0.1538
vqw2vprf, 18, 0
vqw2vprf, 19, 0.0312
vqw2vprf, 20, 0vqw2vprf, 21, 0vqw2vprf, 22, 0
vqw2vprf, 23, 0.3846
vqw2vprf, 24, 0vqw2vprf, 25, 0
vqw2vprf, 26, 0.6757
vqw2vprf, 27, 0.6
vqw2vprf, 28, 0.4444vqw2vprf, 29,
0.3846
vqw2vprf, 30, 0.6818
vqw2vprf, 31, 0
vqw2vprf, 32, 0.1636
vqw2vprf, 33, 0.4595
vqw2vprf, 34, 0.4
vqw2vprf, 35, 0.5
vqw2vprf, 36, 0.0541vqw2vprf, 37,
0.0154
vqw2vprf, 38, 0.5455
vqw2vprf, 39, 0.2353
vqw2vprf, 40, 0.5255
vqw2vprf, 41, 0.4286
vqw2vprf, 42, 0.2
vqw2vprf, 43, 0.6
vqw2vprf, 44, 0.0189
vqw2vprf, 45, 0.5217
vqw2vprf, 46, 0.5
vqw2vprf, 47, 0.6596
vqw2vprf, 48, 0.25
vqw2vprf, 49, 0
vqw2vprf, 50, 0.5
vqw2vprf, all, 0.3798
sqw2vprf, 1, 0.5636
sqw2vprf, 2, 0.627
sqw2vprf, 3, 0.6911
sqw2vprf, 4, 0.3404
sqw2vprf, 5, 0.625
sqw2vprf, 6, 0.6387
sqw2vprf, 7, 0.678
sqw2vprf, 8, 0.5556
sqw2vprf, 9, 0.375
sqw2vprf, 10, 0.8sqw2vprf, 11,
0.7273sqw2vprf, 12, 0.75
sqw2vprf, 13, 0.8148
sqw2vprf, 14, 0.8571
sqw2vprf, 15, 0sqw2vprf, 16, 0
sqw2vprf, 17, 0.1154
sqw2vprf, 18, 0sqw2vprf, 19, 0sqw2vprf, 20, 0
sqw2vprf, 21, 0.0152sqw2vprf, 22, 0
sqw2vprf, 23, 0.3846
sqw2vprf, 24, 0sqw2vprf, 25, 0
sqw2vprf, 26, 0.5946
sqw2vprf, 27, 0.56
sqw2vprf, 28, 0.4444
sqw2vprf, 29, 0.4176
sqw2vprf, 30, 0.6818
sqw2vprf, 31, 0
sqw2vprf, 32, 0.2545
sqw2vprf, 33, 0.4459
sqw2vprf, 34, 0.2sqw2vprf, 35, 0.25
sqw2vprf, 36, 0.0811
sqw2vprf, 37, 0
sqw2vprf, 38, 0.5455
sqw2vprf, 39, 0.2353
sqw2vprf, 40, 0.5182
sqw2vprf, 41, 0.7857
sqw2vprf, 42, 0.2
sqw2vprf, 43, 0.4
sqw2vprf, 44, 0.0189
sqw2vprf, 45, 0.4783
sqw2vprf, 46, 0.5
sqw2vprf, 47, 0.6596
sqw2vprf, 48, 0.25
sqw2vprf, 49, 0
sqw2vprf, 50, 0.6
sqw2vprf, all, 0.3736
vq_noprf, 1, 0.6273
vq_noprf, 2, 0.6587
vq_noprf, 3, 0.6911
vq_noprf, 4, 0.3404
vq_noprf, 5, 0.6833
vq_noprf, 6, 0.605
vq_noprf, 7, 0.661vq_noprf, 8, 0.5926
vq_noprf, 9, 0.375
vq_noprf, 10, 0.84vq_noprf, 11,
0.8182vq_noprf, 12,
0.7917
vq_noprf, 13, 0.8148
vq_noprf, 14, 0.8214
vq_noprf, 15, 0vq_noprf, 16, 0
vq_noprf, 17, 0.1538
vq_noprf, 18, 0.0606
vq_noprf, 19, 0.0625
vq_noprf, 20, 0
vq_noprf, 21, 0.6212
vq_noprf, 22, 0.5349
vq_noprf, 23, 0.3846
vq_noprf, 24, 0vq_noprf, 25, 0
vq_noprf, 26, 0.6757
vq_noprf, 27, 0.6
vq_noprf, 28, 0.4444vq_noprf, 29,
0.3956
vq_noprf, 30, 0.6818
vq_noprf, 31, 0
vq_noprf, 32, 0.1636
vq_noprf, 33, 0.4595
vq_noprf, 34, 0.2
vq_noprf, 35, 0.5
vq_noprf, 36, 0.0541
vq_noprf, 37, 0.0308
vq_noprf, 38, 0.5455
vq_noprf, 39, 0.4118
vq_noprf, 40, 0.5255
vq_noprf, 41, 0.4286
vq_noprf, 42, 0.2
vq_noprf, 43, 0.6
vq_noprf, 44, 0.0189
vq_noprf, 45, 0.5217
vq_noprf, 46, 0.5
vq_noprf, 47, 0.6596
vq_noprf, 48, 0.25
vq_noprf, 49, 0
vq_noprf, 50, 0.5vq_noprf, all,
0.4101
sq_nprf, 1, 0.5545
sq_nprf, 2, 0.6349
sq_nprf, 3, 0.6911
sq_nprf, 4, 0.3404
sq_nprf, 5, 0.6sq_nprf, 6, 0.6303
sq_nprf, 7, 0.678
sq_nprf, 8, 0.5926
sq_nprf, 9, 0.375
sq_nprf, 10, 0.84sq_nprf, 11,
0.8182sq_nprf, 12,
0.7917
sq_nprf, 13, 0.8148
sq_nprf, 14, 0.8214
sq_nprf, 15, 0sq_nprf, 16, 0
sq_nprf, 17, 0.1538
sq_nprf, 18, 0.0606
sq_nprf, 19, 0.0625
sq_nprf, 20, 0
sq_nprf, 21, 0.0303
sq_nprf, 22, 0
sq_nprf, 23, 0.3846
sq_nprf, 24, 0sq_nprf, 25, 0
sq_nprf, 26, 0.6757
sq_nprf, 27, 0.6
sq_nprf, 28, 0.4444sq_nprf, 29,
0.3956
sq_nprf, 30, 0.6818
sq_nprf, 31, 0
sq_nprf, 32, 0.1636
sq_nprf, 33, 0.4595
sq_nprf, 34, 0.2
sq_nprf, 35, 0.5
sq_nprf, 36, 0.0541
sq_nprf, 37, 0.0308
sq_nprf, 38, 0.5455
sq_nprf, 39, 0.4118
sq_nprf, 40, 0.5255
sq_nprf, 41, 0.4286
sq_nprf, 42, 0.2
sq_nprf, 43, 0.6
sq_nprf, 44, 0.0189
sq_nprf, 45, 0.5217
sq_nprf, 46, 0.5
sq_nprf, 47, 0.6596
sq_nprf, 48, 0.25
sq_nprf, 49, 0
sq_nprf, 50, 0.5
sq_nprf, all, 0.3848
trec_best, 1, 0.6727
trec_best, 2, 0.6905
trec_best, 3, 0.6992
trec_best, 4, 0.3404
trec_best, 5, 0.6833trec_best, 6,
0.6387
trec_best, 7, 0.678
trec_best, 8, 0.6667
trec_best, 9, 0.375
trec_best, 10, 0.88
trec_best, 11, 0.8636
trec_best, 12, 0.875trec_best, 13,
0.8148
trec_best, 14, 0.8571
trec_best, 15, 0.2667
trec_best, 16, 0
trec_best, 17, 0.5385trec_best, 18,
0.4848trec_best, 19, 0.4375
trec_best, 20, 0.25
trec_best, 21, 0.6667
trec_best, 22, 0.5581
trec_best, 23, 0.5769
trec_best, 24, 0
trec_best, 25, 0.25
trec_best, 26, 0.8108
trec_best, 27, 0.72trec_best, 28, 0.6667
trec_best, 29, 0.4835
trec_best, 30, 0.7727
trec_best, 31, 0.3636
trec_best, 32, 0.4727
trec_best, 33, 0.5405
trec_best, 34, 0.6
trec_best, 35, 0.5
trec_best, 36, 0.5135
trec_best, 37, 0.5846
trec_best, 38, 0.7273
trec_best, 39, 0.6471
trec_best, 40, 0.5547
trec_best, 41, 0.7857
trec_best, 42, 0.6
trec_best, 43, 0.7trec_best, 44,
0.6038trec_best, 45, 0.5652
trec_best, 46, 0.6111
trec_best, 47, 0.6596
trec_best, 48, 0.4375
trec_best, 49, 0.5
trec_best, 50, 0.65
trec_best, all, 0.576696trec_median, 1,
0.5455
trec_median, 2, 0.5873
trec_median, 3, 0.6098
trec_median, 4, 0.2128
trec_median, 5, 0.575trec_median, 6,
0.5042
trec_median, 7, 0.5424
trec_median, 8, 0.4444
trec_median, 9, 0.25
trec_median, 10, 0.68trec_median, 11, 0.6364
trec_median, 12, 0.7083trec_median, 13,
0.6667trec_median, 14, 0.6071
trec_median, 15, 0
trec_median, 16, 0
trec_median, 17, 0.2308
trec_median, 18, 0.1515
trec_median, 19, 0.1562
trec_median, 20, 0
trec_median, 21, 0.4091
trec_median, 22, 0.2791
trec_median, 23, 0.3462
trec_median, 24, 0
trec_median, 25, 0
trec_median, 26, 0.5676
trec_median, 27, 0.48trec_median, 28, 0.4444
trec_median, 29, 0.2308
trec_median, 30, 0.6364
trec_median, 31, 0
trec_median, 32, 0.1636
trec_median, 33, 0.2703trec_median, 34,
0.2
trec_median, 35, 0
trec_median, 36, 0.0811
trec_median, 37, 0.1692
trec_median, 38, 0.4545
trec_median, 39, 0.2353
trec_median, 40, 0.3504
trec_median, 41, 0.1429
trec_median, 42, 0.2
trec_median, 43, 0.5
trec_median, 44, 0.2358
trec_median, 45, 0.3913
trec_median, 46, 0.4444
trec_median, 47, 0.4468
trec_median, 48, 0.25
trec_median, 49, 0
trec_median, 50, 0.3
trec_median, all, 0.326752
infNDCG_ct
sq vqw2vprf sqw2vprf vq_noprf sq_nprf trec_best trec_median
sq, 1, 0.5909sq, 2, 0.5556sq, 3, 0.5691
sq, 4, 0.0638
sq, 5, 0.6
sq, 6, 0.5378sq, 7, 0.5424
sq, 8, 0.4815
sq, 9, 0.3125
sq, 10, 0.72sq, 11, 0.7273sq, 12, 0.7083sq, 13, 0.6667
sq, 14, 0.6429
sq, 15, 0sq, 16, 0
sq, 17, 0.1538
sq, 18, 0.3333
sq, 19, 0.25
sq, 20, 0
sq, 21, 0.6212
sq, 22, 0.4186sq, 23, 0.4231
sq, 24, 0sq, 25, 0
sq, 26, 0.8108
sq, 27, 0.2
sq, 28, 0.1111
sq, 29, 0.3846
sq, 30, 0.5
sq, 31, 0
sq, 32, 0.1455
sq, 33, 0.3649sq, 34, 0.4
sq, 35, 0
sq, 36, 0.1081
sq, 37, 0.0154
sq, 38, 0.5455
sq, 39, 0.4706sq, 40, 0.5255
sq, 41, 0
sq, 42, 0.2
sq, 43, 0.6
sq, 44, 0.0283
sq, 45, 0.3913
sq, 46, 0.5sq, 47, 0.4894
sq, 48, 0.25
sq, 49, 0
sq, 50, 0.45
sq, all, 0.3482
vqw2vprf, 1, 0.5545
vqw2vprf, 2, 0.6349
vqw2vprf, 3, 0.6911
vqw2vprf, 4, 0.3404
vqw2vprf, 5, 0.5917
vqw2vprf, 6, 0.6303
vqw2vprf, 7, 0.6525vqw2vprf, 8,
0.5926
vqw2vprf, 9, 0.375
vqw2vprf, 10, 0.84vqw2vprf, 11,
0.7273
vqw2vprf, 12, 0.7917
vqw2vprf, 13, 0.8148
vqw2vprf, 14, 0.8214
vqw2vprf, 15, 0vqw2vprf, 16, 0
vqw2vprf, 17, 0.1538
vqw2vprf, 18, 0
vqw2vprf, 19, 0.0312
vqw2vprf, 20, 0vqw2vprf, 21, 0vqw2vprf, 22, 0
vqw2vprf, 23, 0.3846
vqw2vprf, 24, 0vqw2vprf, 25, 0
vqw2vprf, 26, 0.6757
vqw2vprf, 27, 0.6
vqw2vprf, 28, 0.4444vqw2vprf, 29,
0.3846
vqw2vprf, 30, 0.6818
vqw2vprf, 31, 0
vqw2vprf, 32, 0.1636
vqw2vprf, 33, 0.4595
vqw2vprf, 34, 0.4
vqw2vprf, 35, 0.5
vqw2vprf, 36, 0.0541vqw2vprf, 37,
0.0154
vqw2vprf, 38, 0.5455
vqw2vprf, 39, 0.2353
vqw2vprf, 40, 0.5255
vqw2vprf, 41, 0.4286
vqw2vprf, 42, 0.2
vqw2vprf, 43, 0.6
vqw2vprf, 44, 0.0189
vqw2vprf, 45, 0.5217vqw2vprf, 46, 0.5
vqw2vprf, 47, 0.6596
vqw2vprf, 48, 0.25
vqw2vprf, 49, 0
vqw2vprf, 50, 0.5
vqw2vprf, all, 0.3798
sqw2vprf, 1, 0.5636
sqw2vprf, 2, 0.627
sqw2vprf, 3, 0.6911
sqw2vprf, 4, 0.3404
sqw2vprf, 5, 0.625
sqw2vprf, 6, 0.6387
sqw2vprf, 7, 0.678
sqw2vprf, 8, 0.5556
sqw2vprf, 9, 0.375
sqw2vprf, 10, 0.8sqw2vprf, 11, 0.7273sqw2vprf, 12, 0.75
sqw2vprf, 13, 0.8148
sqw2vprf, 14, 0.8571
sqw2vprf, 15, 0sqw2vprf, 16, 0
sqw2vprf, 17, 0.1154
sqw2vprf, 18, 0sqw2vprf, 19, 0sqw2vprf, 20, 0
sqw2vprf, 21, 0.0152sqw2vprf, 22, 0
sqw2vprf, 23, 0.3846
sqw2vprf, 24, 0sqw2vprf, 25, 0
sqw2vprf, 26, 0.5946
sqw2vprf, 27, 0.56
sqw2vprf, 28, 0.4444
sqw2vprf, 29, 0.4176
sqw2vprf, 30, 0.6818
sqw2vprf, 31, 0
sqw2vprf, 32, 0.2545
sqw2vprf, 33, 0.4459
sqw2vprf, 34, 0.2sqw2vprf, 35, 0.25
sqw2vprf, 36, 0.0811
sqw2vprf, 37, 0
sqw2vprf, 38, 0.5455
sqw2vprf, 39, 0.2353
sqw2vprf, 40, 0.5182
sqw2vprf, 41, 0.7857
sqw2vprf, 42, 0.2
sqw2vprf, 43, 0.4
sqw2vprf, 44, 0.0189
sqw2vprf, 45, 0.4783sqw2vprf, 46, 0.5
sqw2vprf, 47, 0.6596
sqw2vprf, 48, 0.25
sqw2vprf, 49, 0
sqw2vprf, 50, 0.6
sqw2vprf, all, 0.3736
vq_noprf, 1, 0.6273
vq_noprf, 2, 0.6587
vq_noprf, 3, 0.6911
vq_noprf, 4, 0.3404
vq_noprf, 5, 0.6833
vq_noprf, 6, 0.605vq_noprf, 7, 0.661vq_noprf, 8,
0.5926
vq_noprf, 9, 0.375
vq_noprf, 10, 0.84vq_noprf, 11,
0.8182vq_noprf, 12,
0.7917
vq_noprf, 13, 0.8148
vq_noprf, 14, 0.8214
vq_noprf, 15, 0vq_noprf, 16, 0
vq_noprf, 17, 0.1538
vq_noprf, 18, 0.0606
vq_noprf, 19, 0.0625
vq_noprf, 20, 0
vq_noprf, 21, 0.6212
vq_noprf, 22, 0.5349
vq_noprf, 23, 0.3846
vq_noprf, 24, 0vq_noprf, 25, 0
vq_noprf, 26, 0.6757
vq_noprf, 27, 0.6
vq_noprf, 28, 0.4444vq_noprf, 29,
0.3956
vq_noprf, 30, 0.6818
vq_noprf, 31, 0
vq_noprf, 32, 0.1636
vq_noprf, 33, 0.4595
vq_noprf, 34, 0.2
vq_noprf, 35, 0.5
vq_noprf, 36, 0.0541
vq_noprf, 37, 0.0308
vq_noprf, 38, 0.5455
vq_noprf, 39, 0.4118
vq_noprf, 40, 0.5255
vq_noprf, 41, 0.4286
vq_noprf, 42, 0.2
vq_noprf, 43, 0.6
vq_noprf, 44, 0.0189
vq_noprf, 45, 0.5217vq_noprf, 46, 0.5
vq_noprf, 47, 0.6596
vq_noprf, 48, 0.25
vq_noprf, 49, 0
vq_noprf, 50, 0.5vq_noprf, all,
0.4101
sq_nprf, 1, 0.5545
sq_nprf, 2, 0.6349sq_nprf, 3, 0.6911
sq_nprf, 4, 0.3404
sq_nprf, 5, 0.6sq_nprf, 6, 0.6303
sq_nprf, 7, 0.678
sq_nprf, 8, 0.5926
sq_nprf, 9, 0.375
sq_nprf, 10, 0.84sq_nprf, 11,
0.8182sq_nprf, 12,
0.7917
sq_nprf, 13, 0.8148
sq_nprf, 14, 0.8214
sq_nprf, 15, 0sq_nprf, 16, 0
sq_nprf, 17, 0.1538
sq_nprf, 18, 0.0606
sq_nprf, 19, 0.0625
sq_nprf, 20, 0
sq_nprf, 21, 0.0303
sq_nprf, 22, 0
sq_nprf, 23, 0.3846
sq_nprf, 24, 0sq_nprf, 25, 0
sq_nprf, 26, 0.6757
sq_nprf, 27, 0.6
sq_nprf, 28, 0.4444sq_nprf, 29,
0.3956
sq_nprf, 30, 0.6818
sq_nprf, 31, 0
sq_nprf, 32, 0.1636
sq_nprf, 33, 0.4595
sq_nprf, 34, 0.2
sq_nprf, 35, 0.5
sq_nprf, 36, 0.0541
sq_nprf, 37, 0.0308
sq_nprf, 38, 0.5455
sq_nprf, 39, 0.4118
sq_nprf, 40, 0.5255
sq_nprf, 41, 0.4286
sq_nprf, 42, 0.2
sq_nprf, 43, 0.6
sq_nprf, 44, 0.0189
sq_nprf, 45, 0.5217sq_nprf, 46, 0.5
sq_nprf, 47, 0.6596
sq_nprf, 48, 0.25
sq_nprf, 49, 0
sq_nprf, 50, 0.5
sq_nprf, all, 0.3848
trec_best, 1, 0.6727
trec_best, 2, 0.6905
trec_best, 3, 0.6992
trec_best, 4, 0.3404
trec_best, 5, 0.6833trec_best, 6,
0.6387trec_best, 7, 0.678
trec_best, 8, 0.6667
trec_best, 9, 0.375
trec_best, 10, 0.88trec_best, 11,
0.8636
trec_best, 12, 0.875trec_best, 13,
0.8148
trec_best, 14, 0.8571
trec_best, 15, 0.2667
trec_best, 16, 0
trec_best, 17, 0.5385trec_best, 18,
0.4848trec_best, 19, 0.4375
trec_best, 20, 0.25
trec_best, 21, 0.6667
trec_best, 22, 0.5581
trec_best, 23, 0.5769
trec_best, 24, 0
trec_best, 25, 0.25
trec_best, 26, 0.8108
trec_best, 27, 0.72trec_best, 28, 0.6667
trec_best, 29, 0.4835
trec_best, 30, 0.7727
trec_best, 31, 0.3636
trec_best, 32, 0.4727
trec_best, 33, 0.5405
trec_best, 34, 0.6
trec_best, 35, 0.5
trec_best, 36, 0.5135
trec_best, 37, 0.5846
trec_best, 38, 0.7273
trec_best, 39, 0.6471
trec_best, 40, 0.5547
trec_best, 41, 0.7857
trec_best, 42, 0.6
trec_best, 43, 0.7trec_best, 44,
0.6038trec_best, 45, 0.5652
trec_best, 46, 0.6111
trec_best, 47, 0.6596
trec_best, 48, 0.4375
trec_best, 49, 0.5
trec_best, 50, 0.65trec_best, all, 0.576696
trec_median, 1, 0.5455
trec_median, 2, 0.5873
trec_median, 3, 0.6098
trec_median, 4, 0.2128
trec_median, 5, 0.575trec_median, 6,
0.5042
trec_median, 7, 0.5424
trec_median, 8, 0.4444
trec_median, 9, 0.25
trec_median, 10, 0.68trec_median, 11, 0.6364
trec_median, 12, 0.7083trec_median, 13,
0.6667trec_median, 14, 0.6071
trec_median, 15, 0
trec_median, 16, 0
trec_median, 17, 0.2308trec_median, 18,
0.1515
trec_median, 19, 0.1562
trec_median, 20, 0
trec_median, 21, 0.4091
trec_median, 22, 0.2791
trec_median, 23, 0.3462
trec_median, 24, 0
trec_median, 25, 0
trec_median, 26, 0.5676
trec_median, 27, 0.48trec_median, 28, 0.4444
trec_median, 29, 0.2308
trec_median, 30, 0.6364
trec_median, 31, 0
trec_median, 32, 0.1636
trec_median, 33, 0.2703trec_median, 34,
0.2
trec_median, 35, 0
trec_median, 36, 0.0811
trec_median, 37, 0.1692
trec_median, 38, 0.4545
trec_median, 39, 0.2353
trec_median, 40, 0.3504
trec_median, 41, 0.1429
trec_median, 42, 0.2
trec_median, 43, 0.5
trec_median, 44, 0.2358
trec_median, 45, 0.3913
trec_median, 46, 0.4444
trec_median, 47, 0.4468
trec_median, 48, 0.25
trec_median, 49, 0
trec_median, 50, 0.3
trec_median, all, 0.326752
R-prec_ct
sq vqw2vprf sqw2vprf vq_noprf sq_nprf trec_best trec_median
Figure 4.4 P-10_ct for Clinical Trials for individual topics and median averages (ours and the TREC PM 2018).
4 Conclusions and future work
For Medline abstracts our results are little worse than the TREC 2018 median. The method is different from classical
methods, and similar to Boolean searches. The biggest deficiency of our approach for Medline abstracts is lack of
distinction between scientific papers and personalized treatment papers. Presumably this feature explains very good
results of [Goodwin et al., 2017] and [Mahmood et al., 2017] in the TREC PM 2017 track. Another reason could be
extraction of relations used be these two teams [Peng et al., 2016]. For Clinical Trials our best result is vq_noprf which
is significantly better (approximately 0.08 for evaluated measures). With a suitable word2vec method the results are
better compared to query extension using Mesh and disease taxonomies.
References
[Amati, van Rijsbergen, 2012] Amati,G., van Rijsbergen,C. J., (2002) Probabilistic models of information retrieval based on
measuring the divergence from randomness, ACM Trans. Inf. Syst. 20(4): 357-389
[TREC PM. 2018], [ http://www.trec-cds.org/2018.html
[TREC PM, Relevance guidelines, http://www.trec-cds.org/relevance_guidelines.pdf
[Roberts et al, 2017] Kirk Roberts, Dina Demner-Fushman, Ellen M. Voorhees William R. Hersh and Steven Bedrick Alexander J.
Lazar Shubham Pant, Overview of the TREC 2017 Precision Medicine Track
https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf
[Chiu , et al., 2016] Chiu,B., Crichton,G., Korhonen,A. et al. (2016) How to train good word embeddings for biomedical NLP. In:
Proceedings of the 5th Workshop on Biomedical Natural Language Processing. Berlin, Germany.
http://www.aclweb.org/anthology/W16-2922.
[Cieslewicz et al, 2017a] Artur Cieslewicz, Jakub Dutkiewicz, Czeslaw Jedrzejek: Baseline and extensions approach to information
retrieval of complex medical data: Poznan's approach to the bioCADDIE 2016. Database 2018: bax103 (2018).
[Cieslewicz et al, 2017b] Artur Cieslewicz, Jakub Dutkiewicz, Czeslaw Jedrzejek:, POZNAN Contribution to TREC PM 2017,
https://trec.nist.gov/pubs/trec26/papers/POZNAN-PM.pdf
[Clinchant, Gaussier,2010] Clinchant,S., Gaussier,E., (2010) Information-based models for ad hoc IR. In SIGIR’10, 234-241.
[Dutkiewicz et al, 2017] Jakub Dutkiewicz, Czesław Jędrzejek, Michał Frąckowiak and Paweł Werda, PUT Contribution to TREC
CDS 2016, https://trec.nist.gov/pubs/trec25/papers/IAII_PUT-CL.pdf
Terrier 5.0 IR Platform www.terrier.org http://terrier.org/docs/v5.0/ accessed July 2, 2018
[Goodwin et al., 2017] Travis R. Goodwin, Michael A. Skinner and Sanda M. Harabagiu UTD HLTRI at TREC 2017: Precision
Medicine Track, https://trec.nist.gov/pubs/trec26/papers/UTDHLTRI-PM.pdf
[Mahmood et al., 2017] A. S. M. Ashique Mahmood, Gang Li , Shruti Rao , Peter McGarvey, Cathy Wu, Subha Madhavan2, K.
Vijay-Shanker UD_GU_BioTM at TREC 2017: Precision Medicine Track
https://trec.nist.gov/pubs/trec26/papers/UD_GU_BioTM-PM.pdf
[Jaiswa et al., 2016] Jaiswal,P., Hoehndorf,R., Cecilia,N. et al. (2016) Proceedings of the Joint International Conference on
Biological Ontology and BioCreative, Corvallis, Oregon, United States. CEUR Workshop
Proceedings 1747, CEUR-WS.org 201.
[Mikolov, et al., 2013] Mikolov, T., Sutskever, I., Chen, K., Corrado, Corrado, G.S., and Dean, (2013b), J. Efficient
Estimation of Word Representations in Vector Space. CoRR abs/1301.3781
[MeSH, 2018], MeSH database: https://www.nlm.nih.gov/mesh/download_mesh.html, Access: July 2, 2018
[Peng et al., 2016] Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, Wen-tau Yih:
Cross-Sentence N-ary Relation Extraction with Graph LSTMs. TACL 5: 101-115 (2017)
Serie1, 1, 0.6
Serie1, 2, 1
Serie1, 3, 0.9
Serie1, 4, 0.5
Serie1, 5, 1Serie1, 6, 1
Serie1, 7, 0.5
Serie1, 8, 0.8
Serie1, 9, 0.5
Serie1, 10, 1
Serie1, 11, 0.9
Serie1, 12, 1Serie1, 13, 1
Serie1, 14, 0.9
Serie1, 15, 0Serie1, 16, 0
Serie1, 17, 0.4
Serie1, 18, 0.1Serie1, 19, 0.1
Serie1, 20, 0Serie1, 21, 0Serie1, 22, 0
Serie1, 23, 0.9
Serie1, 24, 0Serie1, 25, 0
Serie1, 26, 0.9
Serie1, 27, 0.8
Serie1, 28, 0.4
Serie1, 29, 0.9
Serie1, 30, 0.8
Serie1, 31, 0
Serie1, 32, 0.3
Serie1, 33, 1
Serie1, 34, 0.1
Serie1, 35, 0.2Serie1, 36, 0.2
Serie1, 37, 0
Serie1, 38, 0.6Serie1, 39, 0.6
Serie1, 40, 1
Serie1, 41, 0.6
Serie1, 42, 0.1
Serie1, 43, 0.6
Serie1, 44, 0.2
Serie1, 45, 0.7Serie1, 46, 0.7
Serie1, 47, 0.9
Serie1, 48, 0.4
Serie1, 49, 0
Serie1, 50, 0.5Serie1, all, 0.512
Serie2, 1, 0.9
Serie2, 2, 1
Serie2, 3, 0.8
Serie2, 4, 0.5
Serie2, 5, 1
Serie2, 6, 0.9
Serie2, 7, 0.7
Serie2, 8, 0.8
Serie2, 9, 0.5
Serie2, 10, 1
Serie2, 11, 0.9
Serie2, 12, 1Serie2, 13, 1
Serie2, 14, 0.9
Serie2, 15, 0Serie2, 16, 0
Serie2, 17, 0.4
Serie2, 18, 0.1
Serie2, 19, 0.2
Serie2, 20, 0
Serie2, 21, 1
Serie2, 22, 0.9
Serie2, 23, 0.8
Serie2, 24, 0
Serie2, 25, 0.1
Serie2, 26, 0.9
Serie2, 27, 0.8
Serie2, 28, 0.4
Serie2, 29, 0.9
Serie2, 30, 0.8
Serie2, 31, 0
Serie2, 32, 0.3
Serie2, 33, 1
Serie2, 34, 0.1
Serie2, 35, 0.2Serie2, 36, 0.2
Serie2, 37, 0
Serie2, 38, 0.6Serie2, 39, 0.6
Serie2, 40, 1
Serie2, 41, 0.6
Serie2, 42, 0.1
Serie2, 43, 0.6
Serie2, 44, 0.2
Serie2, 45, 0.7Serie2, 46, 0.7
Serie2, 47, 0.9
Serie2, 48, 0.4
Serie2, 49, 0
Serie2, 50, 0.5Serie2, all, 0.558
Serie3, 1, 0.8
Serie3, 2, 1
Serie3, 3, 0.9
Serie3, 4, 0.4
Serie3, 5, 1Serie3, 6, 1
Serie3, 7, 0.7Serie3, 8, 0.7
Serie3, 9, 0.4
Serie3, 10, 0.8Serie3, 11, 0.8
Serie3, 12, 0.7
Serie3, 13, 0.9Serie3, 14, 0.9
Serie3, 15, 0Serie3, 16, 0
Serie3, 17, 0.2
Serie3, 18, 0Serie3, 19, 0Serie3, 20, 0Serie3, 21, 0Serie3, 22, 0
Serie3, 23, 0.9
Serie3, 24, 0Serie3, 25, 0
Serie3, 26, 0.7
Serie3, 27, 0.8
Serie3, 28, 0.4
Serie3, 29, 0.9
Serie3, 30, 0.8
Serie3, 31, 0
Serie3, 32, 0.4
Serie3, 33, 1
Serie3, 34, 0.1Serie3, 35, 0.1Serie3, 36, 0.1
Serie3, 37, 0
Serie3, 38, 0.6
Serie3, 39, 0.3
Serie3, 40, 1Serie3, 41, 1
Serie3, 42, 0.1
Serie3, 43, 0.4
Serie3, 44, 0.2
Serie3, 45, 0.6
Serie3, 46, 0.7
Serie3, 47, 0.9
Serie3, 48, 0.4
Serie3, 49, 0
Serie3, 50, 0.5Serie3, all, 0.482
Serie4, 1, 0.6
Serie4, 2, 1
Serie4, 3, 0.9
Serie4, 4, 0.5
Serie4, 5, 1Serie4, 6, 1
Serie4, 7, 0.7
Serie4, 8, 0.8
Serie4, 9, 0.5
Serie4, 10, 1
Serie4, 11, 0.9
Serie4, 12, 1Serie4, 13, 1
Serie4, 14, 0.9
Serie4, 15, 0Serie4, 16, 0
Serie4, 17, 0.3
Serie4, 18, 0Serie4, 19, 0Serie4, 20, 0Serie4, 21, 0Serie4, 22, 0
Serie4, 23, 0.8
Serie4, 24, 0
Serie4, 25, 0.1
Serie4, 26, 0.9
Serie4, 27, 0.8
Serie4, 28, 0.4
Serie4, 29, 0.9
Serie4, 30, 0.8
Serie4, 31, 0
Serie4, 32, 0.2
Serie4, 33, 1
Serie4, 34, 0.2Serie4, 35, 0.2Serie4, 36, 0.2
Serie4, 37, 0.1
Serie4, 38, 0.6
Serie4, 39, 0.4
Serie4, 40, 1
Serie4, 41, 0.5
Serie4, 42, 0.1
Serie4, 43, 0.6
Serie4, 44, 0.2
Serie4, 45, 0.7Serie4, 46, 0.7
Serie4, 47, 0.9
Serie4, 48, 0.4
Serie4, 49, 0
Serie4, 50, 0.6Serie4, all, 0.508
Serie5, 1, 0.8
Serie5, 2, 1
Serie5, 3, 0.9
Serie5, 4, 0.1
Serie5, 5, 0.9
Serie5, 6, 0.8
Serie5, 7, 0.7
Serie5, 8, 0.4
Serie5, 10, 0.8Serie5, 11, 0.8Serie5, 12, 0.8
Serie5, 13, 0.7
Serie5, 14, 0.8
Serie5, 15, 0Serie5, 16, 0
Serie5, 17, 0.4Serie5, 18, 0.4
Serie5, 19, 0.3
Serie5, 20, 0
Serie5, 21, 1
Serie5, 22, 0.8
Serie5, 23, 0.7
Serie5, 24, 0Serie5, 25, 0
Serie5, 26, 0.9
Serie5, 27, 0.3
Serie5, 28, 0.1
Serie5, 29, 0.9
Serie5, 30, 0.8
Serie5, 31, 0Serie5, 32, 0
Serie5, 33, 0.6
Serie5, 34, 0.2
Serie5, 35, 0
Serie5, 36, 0.3
Serie5, 37, 0.1
Serie5, 38, 0.6Serie5, 39, 0.6
Serie5, 40, 1
Serie5, 41, 0
Serie5, 42, 0.1
Serie5, 43, 0.6
Serie5, 44, 0.3
Serie5, 45, 0.7
Serie5, 46, 0.8Serie5, 47, 0.8
Serie5, 48, 0.4
Serie5, 49, 0
Serie5, 50, 0.5Serie5, all, 0.484
Serie6, 1, 1Serie6, 2, 1Serie6, 3, 1
Serie6, 4, 0.7
Serie6, 5, 1Serie6, 6, 1Serie6, 7, 1
Serie6, 8, 0.8
Serie6, 9, 0.5
Serie6, 10, 1Serie6, 11, 1Serie6, 12, 1Serie6, 13, 1Serie6, 14, 1
Serie6, 15, 0.4
Serie6, 16, 0.1
Serie6, 17, 0.9
Serie6, 18, 0.6Serie6, 19, 0.6
Serie6, 20, 0.2
Serie6, 21, 1
Serie6, 22, 0.9Serie6, 23, 0.9
Serie6, 24, 0.1Serie6, 25, 0.1
Serie6, 26, 1Serie6, 27, 1
Serie6, 28, 0.6
Serie6, 29, 1Serie6, 30, 1
Serie6, 31, 0.4
Serie6, 32, 1Serie6, 33, 1
Serie6, 34, 0.3Serie6, 35, 0.3
Serie6, 36, 0.9
Serie6, 37, 1
Serie6, 38, 0.8Serie6, 39, 0.8
Serie6, 40, 1Serie6, 41, 1
Serie6, 42, 0.3
Serie6, 43, 0.7
Serie6, 44, 1
Serie6, 45, 0.8
Serie6, 46, 0.9
Serie6, 47, 1
Serie6, 48, 0.6
Serie6, 49, 0.1
Serie6, 50, 0.8Serie6, all, 0.762
Serie7, 1, 0.8
Serie7, 2, 1
Serie7, 3, 0.9
Serie7, 4, 0.3
Serie7, 5, 0.9
Serie7, 6, 0.8
Serie7, 7, 0.7
Serie7, 8, 0.6
Serie7, 9, 0.3
Serie7, 10, 0.9
Serie7, 11, 0.8
Serie7, 12, 0.9Serie7, 13, 0.9
Serie7, 14, 0.8
Serie7, 15, 0Serie7, 16, 0
Serie7, 17, 0.3
Serie7, 18, 0.2Serie7, 19, 0.2
Serie7, 20, 0
Serie7, 21, 0.7
Serie7, 22, 0.4
Serie7, 23, 0.5
Serie7, 24, 0Serie7, 25, 0
Serie7, 26, 0.8
Serie7, 27, 0.7
Serie7, 28, 0.4
Serie7, 29, 0.7Serie7, 30, 0.7
Serie7, 31, 0
Serie7, 32, 0.3
Serie7, 33, 0.6
Serie7, 34, 0.1Serie7, 35, 0.1Serie7, 36, 0.1
Serie7, 37, 0.4
Serie7, 38, 0.5
Serie7, 39, 0.4
Serie7, 40, 0.9
Serie7, 41, 0.2
Serie7, 42, 0.1
Serie7, 43, 0.5Serie7, 44, 0.5
Serie7, 45, 0.6Serie7, 46, 0.6Serie7, 47, 0.6
Serie7, 48, 0.3
Serie7, 49, 0
Serie7, 50, 0.4Serie7, all, 0.468
P10_ct
Serie1 Serie2 Serie3 Serie4 Serie5 Serie6 Serie7
[Rocchio, 1971] Rocchio, J. (1971). Relevance feedback in information retrieval. In G. Salton (Ed.), The SMART retrieval
system—Experiments in automatic document processing (pp. 313–323). New York City, USA: Prentice-Hall.
Quality Assurance Copyright, 2002 © Jerzy R. Nawrocki [email protected] Quality Management Auxiliary
Project Planning Copyright, 2002 © Jerzy R. Nawrocki [email protected] Quality Management Auxilliary