PROVE’09
The use of competence ontology
for network building
Kafil Hajlaoui, Xavier Boucher, Michel Beigbeder, Jean Jacques Girardot
Ecole Nationale Supérieure des Mines de Saint Etienne - France
{hajlaoui, boucher, mbeigd, girardot}@emse.fr
Content
• Context and objectives: identification of collaborative networks
• Information extraction on activity fields: brief synthesis of the results
• Information extraction on enterprise competences: principles
• Key steps of the methodology used to build a « Competence Traces » Ontology
• Discussion on the use of the ontology to extract « enterprise competence traces »: association with semantic patterns and performances of the system
• Conclusion
Introduction
Context: to support the creation of collaborative networks of firms
[Figure: enterprises E1 … En, each described by some characteristic pieces of information, grouped into potential collaborative networks]
2 key pieces of information:
• Complementarities of activity fields
• Similarity of competencies
(Results of M. Benali PhD thesis, based on the work of Richardson, 1959)
Introduction
Hypotheses:
• Open universe provided by the web
• Public information: web sites
[Figure: semi-automatic extraction of information from the web sites of enterprises E1 … En; general information on activity fields and competencies + clustering method → Virtual Breeding Environment; market opportunity + database information shared by the partners → Virtual organization]
Information Extraction Mechanisms
Co-operation between firms = Complementary activities & Similar competencies
[Figure: Web site → IEM-1 (Information Extraction on activity fields) and IEM-2 (Information Extraction on competencies) → DE SS → Identification of collaborative corporate networks]
• To develop an automated information extraction mechanism, used before applying a clustering algorithm for network identification
• Scientific issues: to deal with unstructured information available through web sites. How can we use additional semantic information from the “business domain”?
• Application domain for the research: mechanical industry.
IEM-1: Activity Field Identification based on indexing procedure
[Figure: Web sites → controlled indexing with a structured semantic resource (NAF code) → « Request » information vectors (each firm) matched (NAF identification) against « Document » information vectors (each class of the NAF Code); vectorial model and connectionist model compared on precision and recall]
Conclusions
• The right NAF class is found in ~90% of cases
• Precision ~60%, recall ~95%
• So good performance of the indexing procedure, but no clear domination of the connectionist model
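The matching of « Request » vectors against « Document » vectors can be illustrated with a minimal vector-space sketch. The slides do not state the term weighting or the similarity measure, so raw term counts and cosine similarity are assumptions here, and the class labels and terms are invented for illustration only.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_naf_class(request, documents):
    """Return the class label whose « Document » vector is closest to the request."""
    return max(documents, key=lambda label: cosine(request, documents[label]))

# Hypothetical « Document » vectors, one per NAF class (labels and terms are made up).
naf_documents = {
    "machining": Counter({"milling": 3, "turning": 3, "parts": 1}),
    "foundry": Counter({"casting": 4, "mould": 2, "parts": 1}),
}

# Hypothetical « Request » vector built from the indexed text of one firm's web site.
firm_request = Counter({"turning": 2, "milling": 1, "precision": 1})

matched = best_naf_class(firm_request, naf_documents)
```

In this toy case the firm shares terms only with the "machining" vector, so that class is selected.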
IEM-1: Activity Field Identification based on indexing procedure
Input: complementarity degree among NAF class codes
[Figure: complementarity activity graph → clustering procedure]
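The slide does not name the clustering procedure. As one possible reading, the sketch below keeps only the edges of the complementarity graph whose degree reaches a threshold and returns the connected components as clusters. The thresholding rule, the placeholder codes "A" to "D" and the degrees are all assumptions made for illustration.

```python
from collections import defaultdict

def cluster_by_complementarity(degrees, threshold):
    """Group class codes linked by a complementarity degree >= threshold."""
    graph = defaultdict(set)
    nodes = set()
    for (a, b), degree in degrees.items():
        nodes.update((a, b))
        if degree >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    clusters, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Placeholder codes stand in for NAF class codes; the degrees are invented.
degrees = {("A", "B"): 0.8, ("B", "C"): 0.7, ("C", "D"): 0.1, ("A", "D"): 0.2}
clusters = cluster_by_complementarity(degrees, threshold=0.5)
```

With these invented values, A, B and C end up in one cluster and D stays alone.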
IEM-2: Competence Identification based on ontology and pattern matching
Informational context:
• Complexity of the notion of « Enterprise competence »: linked to the technologies, human resources, methods and know-how in use in each company
• Necessity of a linguistic approach: many distinct terms and expressions can carry pieces of information; semantic ambiguity (context); synonymy, etc.
• No structured semantic resource available
IEM-2: Competence Identification based on ontology and pattern matching
• Ontology: conceptual structure of « competence traces », supporting the comparison of distinct companies (similarity of « competence traces »)
• Semantic patterns: deal with ambiguity and other semantic issues during the extraction procedure
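To make the idea of comparing « competence traces » concrete, here is a minimal sketch that treats a trace as the set of ontology classes activated for a company. The slides do not define the similarity measure (it is listed as further work), so the Jaccard index used here is purely an assumption, and the two example traces are invented.

```python
def jaccard(trace_a, trace_b):
    """Jaccard index between two sets of activated ontology classes."""
    if not trace_a and not trace_b:
        return 0.0
    return len(trace_a & trace_b) / len(trace_a | trace_b)

# Hypothetical traces for two companies, using class names from the generic ontology.
trace_e1 = {"Technological Capability", "Technical Resources", "Human Resources"}
trace_e2 = {"Technological Capability", "Technical Resources", "Informational Resources"}

similarity = jaccard(trace_e1, trace_e2)  # 2 shared classes out of 4 distinct ones
```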
Enterprise competence?
• The overall competence of a firm
• Emerges as a combination of capabilities, notably the technological and methodological capabilities…
• These capabilities result from the utilisation of internal resources: human, technical, informational, organisational resources
Ontology?
• Of competence traces
• Built with a methodology which provides a structured approach to control the semantic issues: ARCHONTE was selected
Ontology Building with Archonte
[Figure: ARCHONTE methodology: terms of the domain → differential ontology → reference ontology → computable ontology (operationalisation)]
Ontology Formalisation
Ontology concepts have been structured on 3 distinct conceptual levels (genericity)
[Figure: three conceptual levels (meta-physical concepts, structuring concepts, “parataxic” concepts) shown alongside the Basic Competence Model, Generic Ontology, Domain Ontology and Competence Trace Ontology]
Generic Ontology: Top-Down approach
[Figure: class diagram. Competencies uses (1..*) Capability; Capability uses (1..*) Resources; Technological Capability and Methodological Capability is-a Capability; Human Resources, Technical Resources, Informational Resources and Organisational Resources is-a Resources]
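The generic ontology can be read as a small class hierarchy. The sketch below mirrors the is-a and uses relations named on the slide with Python dataclasses. The reading of the 1..* multiplicities as "at least one" on the used side is an assumption, and the instance names are invented.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Resources:
    name: str

# is-a relations: four kinds of internal resources.
class HumanResources(Resources): pass
class TechnicalResources(Resources): pass
class InformationalResources(Resources): pass
class OrganisationalResources(Resources): pass

@dataclass
class Capability:
    name: str
    uses: List[Resources]

    def __post_init__(self):
        # Assumed reading of "uses 1..*": at least one resource is required.
        if not self.uses:
            raise ValueError("a capability must use at least one resource")

# is-a relations: two kinds of capabilities.
class TechnologicalCapability(Capability): pass
class MethodologicalCapability(Capability): pass

@dataclass
class Competence:
    name: str
    uses: List[Capability]

    def __post_init__(self):
        # Assumed reading of "uses 1..*": at least one capability is required.
        if not self.uses:
            raise ValueError("a competence must use at least one capability")

# Invented example instances.
lathe = TechnicalResources("CNC lathe")
operators = HumanResources("qualified operators")
turning = TechnologicalCapability("precision turning", uses=[lathe, operators])
machining = Competence("machining of mechanical parts", uses=[turning])
```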
Domain Ontology : bottom up approach
Operationalisation: OWL with the “Protégé” tool
Using the ontology for extraction
[Figure: Web sites → pre-treatment of web sites data → pattern identification (Pattern DB) → ontology classes activation (Ontology) → company competence trace]
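The extraction chain (pre-treatment, pattern identification, class activation) can be sketched in a few lines. The patterns used in the actual system are expressed in a formal language that is not reproduced in the slides, so the regular expressions below are stand-ins, and the pattern database, the pattern-to-class mapping and the sample page are all hypothetical.

```python
import re

# Hypothetical pattern database: regular expression -> ontology class it activates.
PATTERN_DB = {
    r"\b(cnc|numerical control)\s+(machin\w+|lathes?)\b": "Technical Resources",
    r"\b(qualified|skilled)\s+(operators?|engineers?|technicians?)\b": "Human Resources",
    r"\bknow[- ]how\s+in\s+\w+": "Methodological Capability",
}

def pretreat(html):
    """Pre-treatment: strip tags, collapse whitespace, lowercase."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip().lower()

def competence_trace(html):
    """Return the set of ontology classes activated by at least one pattern."""
    text = pretreat(html)
    return {cls for pattern, cls in PATTERN_DB.items() if re.search(pattern, text)}

# Invented sample page.
page = "<p>Our <b>skilled operators</b> run CNC machines.</p>"
trace = competence_trace(page)
```

Here a class is activated as soon as one of its patterns occurs once; the real activation rule may well be more selective.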
Pattern representation : example
[Figure: a specific pattern in a formal language; occurrences of the pattern in a web site; pertinent expression delimited within the corpus]
The result observed
For each company : which classes of the ontology are considered activated ?
Performance study
Activation   Precision   Recall
Expert       0.84        0.75
System       0.87        0.64
Comparison of the performance of an expert and of the automatic extraction/activation system in a task consisting in « activating » the ontology classes for a given web site.
[Figure: precision and recall charts for automatic activation and expert activation of ontology classes]
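For reference, precision and recall of a class activation can be computed as below, treating each activation as a set of ontology classes. The slides do not describe how the reference activation was established, so the reference set and the system output here are invented for illustration.

```python
def precision_recall(activated, reference):
    """Precision and recall of a set of activated classes against a reference set."""
    true_positives = len(activated & reference)
    precision = true_positives / len(activated) if activated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical reference activation for one web site.
reference = {
    "Human Resources",
    "Technical Resources",
    "Technological Capability",
    "Methodological Capability",
}
# Hypothetical output of the automatic system for the same web site.
system = {"Human Resources", "Technical Resources", "Informational Resources"}

precision, recall = precision_recall(system, reference)
```

In this toy case two of the three activated classes are correct, and two of the four reference classes are found.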
Conclusion
With the objective of assessing a level of competence similarity among companies:
- We have proposed an approach which takes advantage of semantic information from the domain (concerning competence descriptions) …
- … but which remains very adaptable from one business domain to another.
- The association of patterns and ontology gives the information extraction system a good ability to correctly identify the classes of a « competence traces ontology » corresponding to a company’s web site.
- Further work: to measure the similarity among the competence traces of distinct companies.