CLEF 2003 ELRA/ELDA KM/1
European Language Resources Association
Evaluations and Language resources Distribution Agency

An Exit Strategy to Capitalise on the CLEF Evaluation Campaigns

Kevin McTait
ELRA/ELDA
75013 Paris, France
[email protected]
http://www.elda.fr/
Objective

Objectives of CLEF Workpackage 5:

1. Capitalise on the data-collection efforts made during the CLEF campaigns
2. Enable reproduction of experimental conditions, i.e. make the same reusable training and test data available to other players in the R&D community for benchmarking purposes
Implementation Plan

Implementation in 3 stages:

1. Simplify negotiation of distribution rights for the data collections
   • Secure rights for distribution post-CLEF
2. Produce an Evaluation Package
   • Data, scoring tools, methodologies, protocols and metrics on DVD/CD
   • Documentation, specifications, validation reports, quality stamp
   • Enable the CLIR R&D community to benchmark CLIR systems (invaluable)
   • Fix costing arrangements (distribution costs etc.)
3. Exploit ELRA/ELDA's distribution and promotion procedures
   • ELRA catalogue
   • Long-term availability and a wide audience (all LE areas, even outside CLIR)
   • Communication: website, newsletter, members' news, conferences (LREC, LangTech, ACL etc.)
   • Task similar to LR distribution (the raison d'être of ELRA/ELDA)
   • Clearing house for HLT Evaluation & Evaluation Resources
Step 1 – Distribution Rights

Examples of data collections used for free within CLEF:
• "Le Monde"
  – Specific distribution and end-user agreement for use within CLEF
  – Already distributed by ELRA (outside CLEF)
• "LA Times"
  – Redistributed by NIST for research/evaluation purposes
  – Non-expiring letter of agreement

By 2003:
– Most owners/providers of data collections should have granted distribution rights to ELDA (CLEF campaign vs. post-CLEF)
– Agreement on the use of data collections post-CLEF for further evaluations (evaluation package)
– Agreement on prices with data owners/providers (lowest possible)
Resources (1)

DATASET                                         | 2001 | 2002 | 2003
Multilingual SDA collection (F,G,I)  1994       |  X   |  X   |  X
Multilingual SDA collection (F,G,I)  1995       |  X   |  X   |  X
Russian (Izvestia)                   1994       |      |      |
Russian (Izvestia)                   1995       |      |      |  X
French (Le Monde)                    1994       |  X   |  X   |  X
French (Le Monde)                    1995       |      |      |
German (Frankfurter Rundschau)       1994       |  X   |  X   |  X
German (Frankfurter Rundschau)       1995       |      |      |
German (Der Spiegel)                 1994       |  X   |  X   |  X
German (Der Spiegel)                 1995       |      |  X   |  X
Italian (La Stampa)                  1994       |  X   |  X   |  X
Italian (La Stampa)                  1995       |      |      |  X
Spanish (EFE)                        1994       |  X   |  X   |  X
Spanish (EFE)                        1995       |      |      |  X
Dutch (NRC Handelsblad, Algemeen Dagblad) 1994  |  X   |  X   |  X
Dutch (NRC Handelsblad, Algemeen Dagblad) 1995  |  X   |  X   |  X
Resources (2)

DATASET                                         | 2001 | 2002 | 2003
British English (Glasgow Herald)     1994       |      |      |
British English (Glasgow Herald)     1995       |      |      |  X
Finnish (Aamulehti collection) Nov/Dec 1994     |      |  X   |  X
Finnish (Aamulehti collection)       1995       |      |  X   |  X
Swedish (Tidningarnas Telegrambyrå)  1994       |      |  X   |  X
Swedish (Tidningarnas Telegrambyrå)  1995       |      |      |  X
Image CLEF                                      |      |  X   |  X
American English (LA Times, ed. 2002)           |  X   |  X   |  X
GIRT database                                   |      |  X   |
GIRT-4 database                                 |      |      |  X
AMARYLLIS database                              |      |  X   |
Step 2 – Evaluation Package

• Previous experiment: the AMARYLLIS evaluation package
  – Information Retrieval for the French language
  – 2 campaigns: 1996-97, 1998-99
  – Organised by: INIST (Institute for Scientific and Technical Information), AUPELF (Association of francophone universities), French Ministry of Research
  – Datasets: Le Monde, scientific abstracts, books, multilingual data
• 2001 AMARYLLIS package:
  – Data collections, topics, documentation
  – Evaluation tools, i.e. trec_eval (NIST acknowledged)
  – Distributed at cost, therefore cheap (45 or 100 Euros)
• AURORA evaluation package – evaluation of front-end feature extraction for distributed speech recognition systems
• Contents validated by the consortium, subsequently by external centres
• Enables duplication of experimental conditions
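Scoring tools such as trec_eval compute standard ranked-retrieval metrics from a system's ranked run and the relevance judgements, which is what makes results reproducible across sites. As a minimal sketch of one such metric, here is non-interpolated average precision for a single topic in Python; the document IDs and judgements are invented for illustration:

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated AP: mean of precision values at each rank
    where a relevant document appears in the ranked list."""
    relevant = set(relevant_ids)
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# Hypothetical run for one topic: relevant docs d1 and d2 are retrieved
# at ranks 2 and 4, so AP = (1/2 + 2/4) / 2 = 0.5
print(average_precision(["d3", "d1", "d7", "d2"], {"d1", "d2"}))  # 0.5
```

trec_eval itself reports this averaged over all topics (MAP) alongside recall and precision at fixed cut-offs; distributing the same qrels and topics is what lets later users benchmark against published CLEF results.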
Step 3 – Distribution Method

• ELRA catalogue
• Promotion and distribution plan tested and proven: mailing lists, web site, quarterly newsletters (~1200 recipients)
• Conferences:
  – LangTech 2003, COLING, ACL 2004, special (double) issue of the IR journal dedicated to CLEF
  – LREC 2004 – extended keynote speech: CLEF and IR
• Widespread and regular distribution
• Simplified licensing scheme – distribution/end-user contracts
• Low price (data owner fee + distribution costs only)
• ELDA – now Evaluations and Language resources Distribution Agency
Why ELRA/ELDA?

• Clearing house for LRs (speech, text corpora, lexica, multimodal)
  – Commissions, produces, validates and distributes LRs in a legally sound framework
• Experience in the production, validation, packaging and distribution of Language Resources (+ legal issues)
• Evaluation and Evaluation Resources is a related activity (HLT developers/evaluators are users of LRs)
• Evaluation infrastructure/network of (R&D) centres providing evaluation resources, software, methodologies, protocols
• Carries out independent evaluation (ethical)

ELRA/ELDA (evaluation department) has set up a European clearing house for HLT evaluation in the same way that ELDA has become a major clearing house for Language Resources.
Evaluation Experience

• AURORA
• AMARYLLIS
• ARCADE/ROMANSEVAL
– Word sense disambiguation
• TC-STAR(_P)
– Speech-to-Speech Translation
• Technolangue/EVALDA
– Bilingual alignment, terminology extraction, machine translation, Q/A systems, parsing technology, BN transcription, speech synthesis, man-machine dialogue systems
AMARYLLIS (Multilingual/Parallel corpora)
Promoting the creation of corpora and evaluation procedures for the French language
(i) Evaluation of information retrieval systems in French text corpora
(ii) Methodology of evaluation for similar search tools
Evaluation Projects
ARCADE/ROMANSEVAL
Promoting research in the field of multilingual alignment
• Evaluation of parallel text alignment systems
In collaboration with SENSEVAL/ROMANSEVAL exercise on word-sense disambiguation for Romance languages
Evaluation Projects
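ARCADE-style alignment evaluation compares a system's proposed alignment links against a reference alignment using precision, recall and F-measure over the link sets. A minimal sketch of that kind of scoring in Python (the link sets below are invented for illustration; real links would pair sentence indices from the two halves of a parallel corpus):

```python
def alignment_scores(proposed, reference):
    """Precision, recall and F1 over sets of alignment links (src_idx, tgt_idx)."""
    proposed, reference = set(proposed), set(reference)
    correct = proposed & reference  # links the system got right
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

reference = {(0, 0), (1, 1), (2, 2), (3, 3)}  # gold one-to-one sentence links
proposed = {(0, 0), (1, 2), (2, 2), (3, 3)}   # system got 3 of its 4 links right
print(alignment_scores(proposed, reference))  # (0.75, 0.75, 0.75)
```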
TC-STAR(_P)
Preparatory Action for Speech to Speech Translation
• WP: Language Resources and Evaluation Infrastructure
EU-funded preparatory project (6th Framework Programme) for the TC-STAR project (Technology and Corpora for Speech to Speech Translation).
Evaluation Projects
Technolangue/EVALDA
Permanent evaluation infrastructure for French
• Evaluation for French HLT
French government-funded project for the evaluation of 8 human language technologies.
Evaluation Projects
Technolangue

• TechnoLangue and the EVALDA project:
  – Corpus Alignment
  – Terminology Extraction
  – Machine Translation
  – Syntactic Parsers
  – Q/A Systems
  – Broadcast News Transcription Systems
  – (Text-to-)Speech Synthesis
  – Dialogue Systems
• Budget: 1.2M€
Technolangue/EVALDA

A permanent infrastructure that would focus on:

• R&D on (all) evaluation issues
• Elaboration of evaluation protocols and assessment tools
• Production and validation of Language Resources
• A coordination team for the management and supervision of all projects
• Logistics and support
• Capitalisation on the outcome of each and every project (evaluation resources, tools, methodologies, protocols, best practices)

The ELDA evaluation department is operational, with an expanding team of engineers.