Chances and Challenges in Comparing Cross-Language Retrieval Tools / Giovanna Roda, Vienna, Austria / IRF Symposium 2010, June 3, 2010


DESCRIPTION

Presentation at the IRF Symposium 2010, Vienna, June 3, 2010

TRANSCRIPT

Page 1: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Chances and Challenges in Comparing Cross-Language Retrieval Tools

Giovanna Roda, Vienna, Austria

IRF Symposium 2010 / June 3, 2010

Page 2: Chances and Challenges in Comparing Cross-Language Retrieval Tools

CLEF-IP: the Intellectual Property track at CLEF

CLEF-IP is an evaluation track within the Cross-Language Evaluation Forum (CLEF).¹

organized by the IRF

first track ran in 2009

running this year for the second time

¹ http://www.clef-campaign.org


Page 7: Chances and Challenges in Comparing Cross-Language Retrieval Tools

What is an evaluation track?

An evaluation track in Information Retrieval is a cooperative action aimed at comparing different techniques on a common retrieval task.

produces experimental data that can be analyzed and used to improve existing systems

fosters exchange of ideas and cooperation

produces a reusable test collection, sets milestones

Test collection

A test collection traditionally consists of target data, a set of queries, and relevance assessments for each query.
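
As a small illustration of these three ingredients, here is a minimal sketch of a test collection and of scoring one submitted run against it; the data structures and the Recall@k measure are simplified assumptions for illustration, not the CLEF-IP formats.

    # Minimal sketch: a test collection (target data, queries, relevance
    # assessments) plus a simple recall-at-k evaluation of a run.
    from dataclasses import dataclass
    from typing import Dict, List, Set

    @dataclass
    class TestCollection:
        documents: Dict[str, str]      # doc_id -> text      (target data)
        queries: Dict[str, str]        # query_id -> text    (here: topic patents)
        qrels: Dict[str, Set[str]]     # query_id -> relevant doc_ids (assessments)

    def recall_at_k(run: Dict[str, List[str]],
                    qrels: Dict[str, Set[str]],
                    k: int = 100) -> float:
        """Average per-query recall, where `run` maps each query_id to a
        ranked list of retrieved doc_ids."""
        per_query = []
        for qid, relevant in qrels.items():
            retrieved = set(run.get(qid, [])[:k])
            per_query.append(len(retrieved & relevant) / len(relevant))
        return sum(per_query) / len(per_query)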


Page 12: Chances and Challenges in Comparing Cross-Language Retrieval Tools

CLEF-IP 2009: the task

The main task in the CLEF-IP track was to find prior art for a given patent.

Prior art search

Prior art search consists of identifying all information (including non-patent literature) that might be relevant to a patent's claim of novelty.


Page 14: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Participants - 2009 track

1 Tech. Univ. Darmstadt, Dept. of CS, Ubiquitous Knowledge Processing Lab (DE)

2 Univ. Neuchatel - Computer Science (CH)

3 Santiago de Compostela Univ. - Dept. Electronica y Computacion (ES)

4 University of Tampere - Info Studies (FI)

5 Interactive Media and Swedish Institute of Computer Science (SE)

6 Geneva Univ. - Centre Universitaire d'Informatique (CH)

7 Glasgow Univ. - IR Group Keith (UK)

8 Centrum Wiskunde & Informatica - Interactive Information Access (NL)


Page 22: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Participants - 2009 track

9 Geneva Univ. Hospitals - Service of Medical Informatics (CH)

10 Humboldt Univ. - Dept. of German Language and Linguistics (DE)

11 Dublin City Univ. - School of Computing (IE)

12 Radboud Univ. Nijmegen - Centre for Language Studies & Speech Technologies (NL)

13 Hildesheim Univ. - Information Systems & Machine Learning Lab (DE)

14 Technical Univ. Valencia - Natural Language Engineering (ES)

15 Al. I. Cuza University of Iasi - Natural Language Processing (RO)


Page 29: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Participants - 2009 track

15 participants

48 experiments submitted for the main task

10 experiments submitted for the language tasks


Page 33: Chances and Challenges in Comparing Cross-Language Retrieval Tools

2009-2010: participants

Page 34: Chances and Challenges in Comparing Cross-Language Retrieval Tools

2009-2010: evolution of the CLEF-IP track

2009                         | 2010
1 task: prior art search     | prior art candidate search and classification task
targeting granted patents    | patent applications
15 participants              | 20 participants
all from academia            | 4 industrial participants
families and citations       | include forward citations
manual assessments           | expanded lists of relevant docs
standard evaluation measures | new measure: PRES, more recall-oriented
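
For readers unfamiliar with PRES (Patent Retrieval Evaluation Score), it was proposed by Magdy and Jones (SIGIR 2010) as a recall-oriented measure derived from normalized recall; roughly, with n relevant documents at ranks r_i and N_max the maximum number of results a user is assumed to check,

    PRES = 1 - (sum_i r_i - n(n+1)/2) / (n * N_max)

where, if I recall the definition correctly, relevant documents not retrieved within the top N_max are treated as if ranked just after N_max, so that PRES is 1 when all relevant documents appear at the top of the list and 0 when none are found.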

Page 44: Chances and Challenges in Comparing Cross-Language Retrieval Tools

What are relevance assessments?

A test collection (also known as gold standard) consists of a target dataset, a set of queries, and relevance assessments corresponding to each query.

The CLEF-IP test collection:

target data: 2 million EP patents

queries: full-text patents (without images)

relevance assessments: extended citations


Page 49: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Relevance assessments

We used patents cited as prior art as relevance assessments.

Sources of citations:

1 applicant's disclosure: the USPTO requires applicants to disclose all known relevant publications

2 patent office search report: each patent office will do a search for prior art to judge the novelty of a patent

3 opposition procedures: patents cited to prove that a granted patent is not novel


Page 54: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Extended citations as relevance assessments

direct citations and their families; direct citations of family members and their families

Page 57: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Patent families

A patent family consists of patents granted by different patent authorities but related to the same invention.

simple family: all family members share the same priority number

extended family: there are several definitions; in the INPADOC database, all documents which are directly or indirectly linked via a priority number belong to the same family


Page 60: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Patent families

Patent documents are linked by priorities (figure: INPADOC family). CLEF-IP uses simple families.
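
A minimal sketch of how such an extended citation set can be assembled, assuming hypothetical lookup tables `citations` (patent -> directly cited patents) and `simple_family` (patent -> other members of its simple family); this only illustrates the idea, it is not the actual CLEF-IP tooling.

    # Sketch: extended citations = direct citations of the topic patent and of
    # its simple-family members, each expanded to its own simple family.
    # `citations` and `simple_family` are hypothetical lookup tables.
    from typing import Dict, Set

    def extended_citations(patent: str,
                           citations: Dict[str, Set[str]],
                           simple_family: Dict[str, Set[str]]) -> Set[str]:
        family = simple_family.get(patent, set()) | {patent}
        cited = set()
        for member in family:
            cited |= citations.get(member, set())
        expanded = set()
        for doc in cited:
            expanded |= simple_family.get(doc, set()) | {doc}
        # excluding the topic's own family is an assumption of this sketch
        return expanded - family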

Page 63: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Relevance assessments 2010

Expanding the 2009 extended citations:

1 include citations of forward citations ...

2 ... and their families

This is apparently a well-known method among patent searchers.

Zig-zag search?
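
Continuing the sketch from the patent-families slide, the 2010 expansion could look roughly like this, again with hypothetical lookup tables, here also `forward_citations` (patent -> patents that cite it).

    # Sketch of the 2010 expansion: add citations of forward citations and
    # their simple families to the 2009 extended-citation set. Builds on the
    # extended_citations() sketch shown earlier; all tables are hypothetical.
    def expanded_assessments_2010(patent, citations, forward_citations, simple_family):
        relevant = extended_citations(patent, citations, simple_family)
        for fwd in forward_citations.get(patent, set()):          # patents citing the topic
            for doc in citations.get(fwd, set()):                 # what those patents cite ...
                relevant |= simple_family.get(doc, set()) | {doc} # ... and their families
        return relevant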


Page 68: Chances and Challenges in Comparing Cross-Language Retrieval Tools

How good are the CLEF-IP relevance assessments?

CLEF-IP uses families + citations:

Page 69: Chances and Challenges in Comparing Cross-Language Retrieval Tools

How good are the CLEF-IP relevance assessments?

how complete are extended citations as relevance assessments?

will every prior art patent be included in this set?

and if not, what percentage of prior art items are captured by extended citations?

when considering forward citations, how good are extended citations as a prior art candidate set?


Page 73: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Feedback from patent experts needed

Quality of prior art candidate sets has to be assessed

Page 74: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Feedback from patent experts needed

Know-how of patent search experts is needed

Page 75: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Feedback from patent experts needed

at CLEF-IP 2009, 7 patent search professionals assessed 12 search results

the task was not well defined and there were misunderstandings about the concept of relevance

the amount of data was not sufficient to draw conclusions

Page 79: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Some initiatives associated with Clef–Ip

The results of evaluation tracks are mostly useful for the research community.

This community often produces prototypes that are of little interest to the end-user.

Next I'd like to present two concrete outcomes - not of CLEF-IP directly, but arising from work in patent retrieval evaluation.


Page 82: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Soire

Page 83: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Soire

developed at Matrixware

service-oriented architecture - available as a Web service

allows replicating IR experiments based on the classical evaluation model

tested on the CLEF-IP data

customized for the evaluation of machine translation


Page 88: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Spinque

Page 89: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Spinque

a spin-off (2010) from CWI (the Dutch National Research Center in Computer Science and Mathematics)

introduces search-by-strategy

provides optimized strategies for patent search - tested on CLEF-IP data

transparency: understand your search results to improve the strategy


Page 93: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Clef–Ip 2009 learnings

Humboldt University implemented a model for patent search that produced the best results.

The model combined several strategies (see the sketch after this list):

using metadata (IPC, ECLA)

indexes built at lemma level

an additional phrase index for English

crosslingual concept index (multilingual terminological database)
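
A rough sketch of this kind of combination follows; the index objects, weights, and the IPC/ECLA filter below are illustrative assumptions, not the actual Humboldt system.

    # Sketch: fuse scores from several indexes (lemma, English phrase,
    # crosslingual concept) and keep only candidates that share a
    # classification code with the topic. All objects here are hypothetical.
    from collections import defaultdict
    from typing import Dict, List, Set, Tuple

    def combined_search(topic_id: str,
                        indexes: Dict[str, object],        # name -> index with .search(topic_id)
                        weights: Dict[str, float],         # name -> fusion weight
                        class_codes: Dict[str, Set[str]],  # doc_id -> IPC/ECLA codes
                        k: int = 1000) -> List[Tuple[str, float]]:
        scores = defaultdict(float)
        for name, index in indexes.items():
            for doc_id, score in index.search(topic_id):   # assumed to yield (doc_id, score)
                scores[doc_id] += weights[name] * score
        topic_codes = class_codes.get(topic_id, set())
        ranked = [(d, s) for d, s in scores.items()
                  if class_codes.get(d, set()) & topic_codes]   # metadata filter
        ranked.sort(key=lambda pair: pair[1], reverse=True)
        return ranked[:k]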


Page 99: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Some additional investigations

Some citations were hard to find

% runs (x)   | class
x ≤ 5        | hard
5 < x ≤ 10   | very difficult
10 < x ≤ 50  | difficult
50 < x ≤ 75  | medium
75 < x ≤ 100 | easy
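
Reading x as the percentage of submitted runs that retrieved a given citation (my interpretation of the table), such a per-citation difficulty classification can be computed roughly as in this sketch.

    # Sketch: label each cited (relevant) document by how many of the
    # submitted runs retrieved it. The thresholds mirror the table above;
    # treating "% runs" as the share of runs that found the citation is an
    # assumption.
    def difficulty_class(percent_of_runs: float) -> str:
        if percent_of_runs <= 5:
            return "hard"
        if percent_of_runs <= 10:
            return "very difficult"
        if percent_of_runs <= 50:
            return "difficult"
        if percent_of_runs <= 75:
            return "medium"
        return "easy"

    def classify_citations(runs, qrels):
        """runs: list of dicts topic_id -> set of retrieved doc_ids;
        qrels: topic_id -> set of cited doc_ids."""
        labels = {}
        for topic, cited in qrels.items():
            for doc in cited:
                found = sum(1 for run in runs if doc in run.get(topic, set()))
                labels[(topic, doc)] = difficulty_class(100.0 * found / len(runs))
        return labels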


Page 101: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Some additional investigations

We looked at the content of citations and citing patents.

Page 102: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Some additional investigations

Ongoing investigations.

Page 103: Chances and Challenges in Comparing Cross-Language Retrieval Tools

Thank you for your attention.