Automatic Annotation of text for Bibliometrics Use

M. Bertin, J.P. Desclés, B. Djioua and Y. Krushkov

University of Paris-Sorbonne, LaLICC Laboratory

FRANCE

May 12, 2006

M. Bertin, J.P. Desclés, B. Djioua and Y. Krushkov, Automatic Annotation of text for Bibliometrics Use

Introduction, Part I: Linguistic approach, Part II: Technical approach, Part III: Informatic implementation, Conclusion

Problematic, Definition, Laws of Bibliometrics, Bibliometric, Protocol

Problematic

What is the basic problem we are trying to solve?
Bibliometrics literally means "book measurement". How can we qualify the relations between authors?

What was the technical solution strategy?
We use Contextual Exploration, developed by the LaLICC laboratory.

What were the basic elements of the research and program approach?
Automatically identify and categorize semantic relations using the EXCOM platform (Multilingual COntextual EXploration).

Definition

Pritchard, 1969

The definition and purpose of bibliometrics is to shed light on the process of written communications ... by means of counting and analyzing the various facets of written communication.

Laws of Bibliometrics

Lotka’s Law

Lotka’s Law describes the frequency of publication by authors in a given field.

Bradford’s Law

Bradford’s Law helps librarians determine the number of core journals in any given field.

Zipf’s Law

Zipf’s Law is often used to predict the frequency of words within a text.
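
As a rough illustration of the kind of counting these laws rely on, the sketch below builds a rank-frequency table for a short text, which is the quantity Zipf’s Law describes (frequency roughly inversely proportional to rank on large corpora). The function name `rank_frequency`, the crude tokenizer and the sample sentence are assumptions made for this example; they do not come from the presentation.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Lowercase and keep alphabetic tokens only (a deliberately crude tokenizer).
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


sample = "the cat and the dog and the bird"  # toy text, far too small to show the law
for rank, word, count in rank_frequency(sample):
    print(rank, word, count)
```

On a real corpus one would compare `count` against `1 / rank`; the toy sentence only shows how the table is built.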

but ...

However, these treatments ignore the qualitative meaning of bibliographic references. For example, authors are "ranked" by how many of their articles have been published.

Bibliographic coupling, co-citation, data mining, Impact Factor...

Specific search engines such as CiteSeer

Protocol

Use FA (finite automata) to find bibliographic references

The indicator allows us to locate the textual segment

EXCOM annotates this textual segment automatically and semantically

We automatically locate bibliographic references to find textual segments with interesting information
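
The first two steps of this protocol can be sketched as follows: find each citation marker, then return the sentence that contains it as the textual segment. This is only a minimal sketch under stated assumptions: a regular expression stands in for the finite automaton, only numeric markers such as `[1]` are handled, the sentence splitter is naive, and the names `CITATION` and `segments_with_citations` are hypothetical.

```python
import re

# Hypothetical pattern: numeric markers such as [1] or [12] only.
CITATION = re.compile(r"\[\d+\]")


def segments_with_citations(text):
    """Return (marker, sentence) pairs for each citation marker found."""
    # Naive sentence split on '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pairs = []
    for sentence in sentences:
        for match in CITATION.finditer(sentence):
            # The marker plays the role of the indicator; the sentence is the segment.
            pairs.append((match.group(), sentence))
    return pairs


text = ("Contextual exploration was introduced earlier [1]. "
        "We extend it here. Our results differ from [2].")
for marker, segment in segments_with_citations(text):
    print(marker, "->", segment)
```

The sentence without a marker is skipped, which mirrors the idea that the bibliographic reference is what points to the segment worth annotating.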

Categorization, Tree Representation

Linguistic approach

Now we have to determine which types of relation can be found in scientific articles. This work was done by J.P. Desclés and Y. Krushkov in 2005.

Categorization

Point of view: The first category is the point of view. It is used extensively in the corpus. Using the point of view, we can assert our opinion on the question.

Comparison: The second category we are interested in is comparison. We often compare the works of researchers. If the comparison is neutral, we have to use contextual exploration to determine whether it is a case of resemblance or disparity.

Information: The information category is broad. It includes subcategories such as hypothesis, analysis and result.

Definition: Sentences that contain definitions are important.

Appreciation: In this category, the author gives a positive or negative judgement about another author.

Tree of categorization

International Standards, FA and bibliographic references

International Standards

ISO 690:1987 Documentation – Bibliographic references – Content, form and structure

Standardized:

Numerical Type [1]

Alpha-numerical Type [AUT90], [AUT 97], [AUT 94, AUT96a], [AUT 96b], [AUT’99]

Another Type (Author et al., 1997, Author et al., 1999)

Unstandardized:

Author (1968,1976,1982)

Author (1958: 274-275)

(Author1, 1944, p.311 ; Author2,1956, p.316)
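
The citation forms listed on these slides are regular enough to be recognized by simple patterns. The sketch below uses Python regular expressions as a stand-in for the finite automata mentioned in the protocol; the three patterns, the labels and the name `find_markers` are assumptions made for illustration, not the authors' automata. The author-year pattern simply accepts any parenthesis containing a four-digit number, so it will produce false positives on ordinary text.

```python
import re

# One key of an alpha-numerical marker, e.g. AUT90, AUT 97, AUT96a, AUT’99.
KEY = r"[A-Z]+\s?['’]?\d{2}[a-z]?"

PATTERNS = [
    # Numerical type: [1] or [1, 2]
    ("numeric", re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")),
    # Alpha-numerical type: [AUT90] or [AUT 94, AUT96a]
    ("alphanumeric", re.compile(r"\[" + KEY + r"(?:\s*,\s*" + KEY + r")*\]")),
    # Author-year forms: any parenthesis that contains a four-digit year
    ("author-year", re.compile(r"\([^()]*?\b\d{4}\b[^()]*\)")),
]


def find_markers(text):
    """Return (label, marker) pairs in order of appearance in the text."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return [(label, marker) for _, label, marker in sorted(found)]


example = "See [1], [AUT 94, AUT96a] and Author (1958: 274-275)."
for label, marker in find_markers(example):
    print(label, marker)
```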

FA and bibliographic references

EXCOM Platform, EXCOM Architecture, EXCOM Rules, Application

EXCOM Platform

A major objective of the EXCOM system, developed by the LaLICC Laboratory, is to explore the semantics of text in order to enhance information extraction and retrieval through automatic annotation of semantic relations.

1 Regex: Low-level annotation that recognizes complex textual expressions which can be described by a finite state machine.

2 Structure: Use of annotations as linguistic markers.

3 Contextual Exploration: This module shows how our proposal extends XML technology and how it can be used in a system for the semantic annotation of texts.
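
To give a feel for the third module, here is a toy sketch of an indicator-plus-clue rule that wraps a segment in an XML-like annotation. It is not EXCOM's rule format, which the slides do not reproduce: the clue lists, the `segment` tag, the `category` attribute and the function name `annotate` are all hypothetical, and the category names are borrowed from the categorization slides.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical clue lists, invented for this example only.
RULES = {
    "comparison": ["similar to", "in contrast to", "differs from"],
    "definition": ["is defined as", "we define"],
    "appreciation": ["elegant", "fails to"],
}

# A numeric citation marker acts as the indicator that triggers the rules.
INDICATOR = re.compile(r"\[\d+\]")


def annotate(segment):
    """Wrap a segment in an XML-like tag naming the first matching category."""
    if not INDICATOR.search(segment):
        return escape(segment)  # no indicator: leave the segment unannotated
    lowered = segment.lower()
    for category, clues in RULES.items():
        # Look for a contextual clue in the same segment as the indicator.
        if any(clue in lowered for clue in clues):
            return '<segment category="{}">{}</segment>'.format(
                category, escape(segment))
    return escape(segment)  # indicator present but no clue matched


print(annotate("Our method differs from the approach in [3]."))
print(annotate("We define X as Y."))
```

The second call returns its input unchanged because it contains no citation marker, so no rule is triggered even though a definition clue is present.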

EXCOM Architecture

EXCOM Rules

Developed and implemented rules for automatic annotation of categories

EXCOM Rules : Method subcategory annotation

Result

New concept, Summary, Bibliography, Questions?

New concept: bibliosemantic

            Bibliometric    Bibliosemantic
Approach    Statistical     Linguistic
Method      Frequency       Semantic
            Word            Textual segment
Result      Quantitative    Qualitative

Summary

1 Use FA to find bibliographic references

2 Bibliographic references, called indicators, are used to locate textual segments

3 Categorize textual segments semantically and automatically using the EXCOM platform

And next?

1 Enlarge the corpus

2 Extend the corpus to other languages

3 Establish a semantic author citation network

4 Establish a categorization of scientific publications

Bibliography

Desclés, J.P. Exploration contextuelle et sémantique : un système expert qui trouve les valeurs sémantiques des temps de l’indicatif dans un texte, p. 371-400. Knowledge modeling and expertise transfert, 1991.

Desclés, J.P. Système d’exploration contextuelle, p. 215-232. Co-texte et calcul du sens, 1997.

Krushkov, Y. L’exploration contextuelle des appariements entre les références bibliographiques et les passages textuels dans un corpus de textes linguistiques. Master, University of Paris IV Sorbonne, 2005.

I am hungry, aren’t you? So, short and quick questions. It’s time for lunch!
