Main Mono and Bilingual Tasks: Track Organisation and Results Analysis

CLEF 2007 Workshop, Budapest, Hungary, 19–21 September 2007

Nicola Ferro, University of Padua, Italy

[email protected]

Carol Peters, ISTI-CNR, Area di Ricerca Pisa, Italy

[email protected]

Giorgio M. Di Nunzio, University of Padua, Italy

[email protected]


Outline




Information Hierarchy

experimental collections and the experiments are data, since they are the raw, basic elements needed for any further investigation

performance measurements are information, since they are the result of computations and processing on the data

descriptive statistics and the hypothesis tests are knowledge, since they are a further elaboration of the information carried by the performance measurements

theories, models, algorithms, and techniques are wisdom, since they provide interpretation, explanation, and formalization of the content of the previous levels

Data: experiments and experimental collections
Information: measures
Knowledge: statistics
Wisdom: papers



Approach to the Evaluation (1/2)

Introduce a conceptual model
  it makes clear what the entities entailed by the information space of an evaluation campaign are, together with their features and their relationships
  logical models can be derived from it to manage and preserve the experimental data
  commonly agreed data formats for exchanging information can be derived from it

Develop common metadata formats
  they provide meaning to the data, and thereby enable their sharing and re-use
  they make it possible to keep track of the lineage of the managed information

Adopt a unique identification mechanism
  it allows for explicit citation of and easy access to the scientific data
  it supports the enrichment of the scientific data



Approach to the Evaluation (2/2)

Provide common tools for statistical analyses
  they allow for judging whether measured differences between retrieval methods can be considered statistically significant
  a uniform way of performing statistical analyses on experiments makes the analysis and assessment of the experiments comparable too

Design and develop a Digital Library System (DLS) for IR scientific data
  it is well suited for managing and making accessible the scientific data and the experiments produced during the course of an evaluation campaign
  it also provides tools for analyzing, comparing, and citing the scientific data of an evaluation campaign, as well as curating, preserving, annotating, enriching, and promoting the re-use of them

Give organizations responsible for evaluation initiatives an active role in this process
  they should take a leadership role in developing a comprehensive strategy for long-lived digital data collections and drive the research community through this process in order to improve the way of doing research
  they should also take care of defining guiding principles, policies, and best practices for making use of the scientific data produced during the evaluation campaign itself
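As an illustration of what a common statistical tool could look like, the sketch below implements an exact paired sign-flip (randomization) test on per-topic scores of two runs. The slides do not say which hypothesis tests are used, so the choice of test, the function name `paired_sign_flip_test`, and the idea of feeding it per-topic average precision are all assumptions made for this example.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired randomization (sign-flip) test.

    scores_a, scores_b: per-topic scores of two runs on the same topics
    (for example average precision; this choice is an assumption).
    Returns the p-value under the null hypothesis that the two runs are
    interchangeable. All 2**n sign patterns are enumerated, so this is
    only practical for a small number of topics.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("runs must be scored on the same topics")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if permuted >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With realistic topic counts (dozens of topics), sampling random sign patterns instead of enumerating all of them would be the usual variant; the exact version is shown here only because it is deterministic and short.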



Internationalization of the User Interface

Bulgarian Petya Osenova, Kiril Simov

Czech Pavel Pecina

English Marco Dussin

French Jacques Savoy

German Thomas Mandl

Indonesian Mirna Adriani

Italian Marco Dussin

Portuguese Paulo Rocha, Diana Santos

Spanish Julio Villena Román



Identification: Digital Object Identifiers (DOI)

DOIs
  allow us to uniquely identify a digital object
  are persistent and actionable
  aim especially at the intellectual property

We assign DOIs to:
  collections: prefix 10.2453
  topics: prefix 10.2452
  experiments: prefix 10.2415
  pools: prefix 10.2454
  statistical tests: prefix 10.2455

10.2415/AH-BILI-X2BG-CLEF2007.JHU-APL.APLBIENBGTD4

http://www.medra.org
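The prefix scheme above can be sketched in a few lines: a DOI is split at the first slash into prefix and suffix, and the prefix is looked up in the table from the slide. The function names `split_doi` and `resource_type` are hypothetical, and the suffix is deliberately left uninterpreted because the slide does not describe its structure.

```python
# Prefix-to-resource mapping exactly as listed on the slide.
DOI_PREFIXES = {
    "10.2453": "collection",
    "10.2452": "topic",
    "10.2415": "experiment",
    "10.2454": "pool",
    "10.2455": "statistical test",
}


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a DOI of the form prefix/suffix: {doi!r}")
    return prefix, suffix


def resource_type(doi):
    """Return the kind of resource a DOI names, or 'unknown'."""
    prefix, _ = split_doi(doi)
    return DOI_PREFIXES.get(prefix, "unknown")
```

For the example DOI shown on the slide, `resource_type("10.2415/AH-BILI-X2BG-CLEF2007.JHU-APL.APLBIENBGTD4")` returns `"experiment"`.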



DOI Resolution

http://dx.doi.org
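Resolution works by appending the DOI to the resolver address. A minimal sketch, assuming the resolver shown on the slide and a hypothetical helper name `resolution_url`; it only builds the URL and performs no network request.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org"  # resolver address shown on the slide


def resolution_url(doi, resolver=RESOLVER):
    """Build the URL that resolves a DOI.

    The '/' between prefix and suffix is kept readable; any character
    that is unsafe in a URL path is percent-encoded.
    """
    return f"{resolver.rstrip('/')}/{quote(doi, safe='/')}"
```

For the experiment DOI used as an example earlier, this yields `http://dx.doi.org/10.2415/AH-BILI-X2BG-CLEF2007.JHU-APL.APLBIENBGTD4`.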



Experiment Metrics



Experiment Statistics



Experiment Plots



Task Statistics



Task Plots



Appendices (1/2)



Appendices (2/2)



Participation



Participation by Country



Tasks and Collections

Monolingual and bilingual tasks were principally offered for Central European languages: Bulgarian, Czech and Hungarian

Topics in 16 languages
  European languages: Bulgarian, Czech, English, French, Hungarian, Italian and Spanish
  non-European languages (for X2EN): Amharic, Chinese, Indonesian, Oromo
  Indian sub-task: Bengali, Hindi, Marathi, Tamil and Telugu

Language    Task                               Collection
Bulgarian   Monolingual BG, Bilingual X2BG     Sega 2002, Standart 2002, Novinar 2002*
Czech*      Monolingual CS, Bilingual X2CS     Mlada fronta DNES 2002, Lidové Noviny 2002
Hungarian   Monolingual HU, Bilingual X2HU     Magyar Hirlap 2002
English     Bilingual X2EN (Indian sub-task)   LA Times 2002*



Participation by Task

172 submitted runs



Runs by Source Language



Monolingual Bulgarian



Monolingual Czech



Monolingual Hungarian



Monolingual English*



Approaches to Monolingual Retrieval

Main emphasis: stemming, morphological analysis, relevance feedback

Relevance feedback: probabilistic RF, mutual information RF
Morphological lemmatizer
Stemming vs 4-grams
Impact on individual topics but not on average
Blind relevance feedback can be detrimental
Linguistic stemmers: both light and aggressive
Indexing: word-based or 4-grams
Word decompounding
NLP techniques, Named Entity Recognition
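To make the contrast between word-based and 4-gram indexing concrete, here is a minimal sketch of the two tokenizations. It shows one common variant only (n-grams taken inside each word, short words kept whole); the slides do not specify how participants built their n-grams, and the function names are hypothetical.

```python
def word_tokens(text):
    """Word-based indexing units: lowercased, whitespace-separated tokens."""
    return text.lower().split()


def char_ngrams(text, n=4):
    """Overlapping character n-grams taken inside each word.

    Words of length n or shorter are kept whole. N-grams spanning word
    boundaries are another common variant and are not produced here.
    """
    grams = []
    for word in word_tokens(text):
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams
```

For example, `char_ngrams("retrieval")` gives `["retr", "etri", "trie", "riev", "ieva", "eval"]`, so inflected forms of a word still share most of their indexing units without any language-specific stemmer.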



Bilingual X → English



Approaches to Bilingual X2EN

Main emphasis: bilingual dictionaries, machine translation, coverage of lexicons, use of pivot languages

bilingual dictionaries and pivot languages

query expansion with RF

parallel corpora

translation ambiguity resolution with a graph based approach

lexicon coverage with a pattern-based approach

Afaan Oromo stemmer

stop list creation

bilingual Oromo-English dictionary creation

Bilingual Hungarian to English

bilingual dictionary

exploiting Wikipedia to remove improbable translations

Best bilingual English system is about 88% of the best monolingual system



Bilingual X2EN: Indian Subtask

bilingual dictionary
OOV using a rule-based approach for transliteration and edit distances
translation disambiguation via a page-rank style algorithm

statistical MT system trained on parallel aligned sentences
language models

Hindi-English and Telugu-English dictionaries created in one week
TFIDF approach combined with boolean operators

bilingual dictionaries
stop list creation
stemming and n-gram

limited linguistic resources
phoneme-based transliterations to generate equivalent English queries
stemmers and morphological analyzers if available
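The use of edit distances for out-of-vocabulary (OOV) terms can be sketched as follows: a transliterated candidate is compared against an English vocabulary and the closest entries are kept. This is a generic Levenshtein-distance illustration, not the participants' actual method; the function names, the distance threshold, and the example word are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance with unit costs, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def closest_terms(candidate, vocabulary, max_distance=2):
    """Vocabulary terms within max_distance of candidate, nearest first."""
    scored = [(edit_distance(candidate, term), term) for term in vocabulary]
    return sorted((d, t) for d, t in scored if d <= max_distance)
```

With a hypothetical transliteration `"dilli"` and the vocabulary `["delhi", "dollar"]`, `closest_terms` returns `[(2, "delhi")]`: the transliteration step only needs to land near the English spelling, and the edit-distance match absorbs the remaining differences.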
