analyzing the evolution of linked...
TRANSCRIPT
www.moving-project.eu
Mohammad Abdel-Qader1,2, Iacopo Vagliano1, and Ansgar Scherp3 1 ZBW – Leibniz Information Centre for Economics
2 Christian-Albrechts-University in Kiel, Germany 3 University of Essex, United Kingdom
Analyzing the Evolution of Linked Vocabularies
June 14th , 2019, 19th ICWE 2019 11.06.2019- 14.06.2019, Daejeon, Korea
2 of 20
Motivation example 1
Analyzing the Evolution of Linked Vocabularies
schema
voaf
gn
loc
bevon
dcat
igeo
adms
food
oslo person lib
search
idemo
void
3 of 20
Motivation example 2
Analyzing the Evolution of Linked Vocabularies
22.05.12 25.06.12
22.11.12 27.04.13
24.05.13
24.09.13
16.09.13 22.07.15
V2 V3 V4
V2 V1 V3 V4 V6 V5
21.12.13
adms:SemanticAsset
adms:accessURL
adms:SemanticAsset
adms:accessURL
Need to be updated
12 created
terms
5 modified
terms
57 created terms
7 modified terms
23.02.13
V1
food
adms
4 of 20
Research Questions
• RQ1: What is the state of the Network of Linked Vocabularies? • Size? • Density, average degree? • Central nodes?
• RQ2: How are vocabulary terms reused by other vocabularies?
• How many? • Most recent ones? • Change impact?
• RQ3: How do ranking metrics, such as PageRank, HITS, and
Centrality, change during the evolution of the Network of Linked Vocabularies? • Important nodes? • Reuse over time?
Analyzing the Evolution of Linked Vocabularies
5 of 20
Analysis Methodology- Step1
Analyzing the Evolution of Linked Vocabularies
LOV- Linked Open Vocabulary1
636 vocabularies until June 2018
Extract
terms
Own terms
……
.......
……
……
Imported terms
……
.......
……
……
1http://lov.okfn.org/dataset/lov/
6 of 20
Analysis Methodology- Step2
Analyzing the Evolution of Linked Vocabularies
Density
in-/out degree
Average degree
Centrality
Network diameter
PageRank HITS
7 of 20
Analysis Methodology- Step3
Analyzing the Evolution of Linked Vocabularies
Imported terms
term1
term2
term3
term4
term5
term6
……
……
Most recent terms
Outdated terms
8 of 20
Analysis Methodology- Step4
Analyzing the Evolution of Linked Vocabularies
2001 2018
Repeat the methodology
9 of 20
Results-RQ1 State of NeLO 2018
Analyzing the Evolution of Linked Vocabularies
Measure Value
Nodes 994
Edges 7046
Diameter 12
Density 0.007
Degree 7.089
The full results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution
10 of 20
Results-RQ1 State of NeLO 2018
Analyzing the Evolution of Linked Vocabularies
Vocabulary Authority Hub
dcterms 0.305421 0.037978
dce 0.242374 0.037727
foaf 0.234664 0.044112
vann 0.184754 0.045030
skos 0.171827 0.034529
cc 0.113386 0.034723
vs 0.081972 0.040256
voaf 0.080739 0.045920
dctype 0.058152 0.037727
schema.org 0.046659 0.040364
Vocabulary PageRank
dce 0.045954
dcterms 0.027649
skos 0.017678
foaf 0.013986
dcam 0.009152
vann 0.009117
grddl 0.008740
dctype 0.005744
cc 0.005446
vs 0.005005
Top-10 vocabularies for HITS (Hub and Authority) scores in 2018
Top-10 vocabularies for PageRank scores in 2018
11 of 20
Results-RQ2 Reuse of vocabulary terms
Analyzing the Evolution of Linked Vocabularies
Term Importing vocabularies.
dcterms:modified 281
dcterms:title 276
dce:title 266
dce:creator 263
vann:preferredNamespacePrefix 257
dcterms:description 249
vann:preferredNamespaceUri 241
foaf:Person 175
foaf:name 164
cc:license 122
Top-10 terms that are reused by other vocabularies in 2018.
12 of 20
Results-RQ2 Reuse of vocabulary terms
Analyzing the Evolution of Linked Vocabularies
0
2
4
6
8
10
12
14
16
1 2 5 10 15 20
Nu
mb
er o
f v
oca
bu
lari
es
Outdated terms imported
The number of vocabularies that reuse outdated terms by the number of outdated terms reused
13 of 20
Results-RQ2 Reuse of vocabulary terms
Analyzing the Evolution of Linked Vocabularies
123 124
457
828
1,345 1,698
2,435
3,884 5,034
6,816
13,047
19,003
27,157
40,378
52,176
58,059
59,613
62,331
72 68 85
480 555 701 786 877
1,159 1,480
1,993 2,328 2,982
3,781 5,162 5,539 5,786
9,750
1
10
100
1,000
10,000
100,000
Nu
mb
er o
f te
rms
Existing terms
Imported terms
The number of existing terms and the reused terms from other vocabularies
14 of 20
Results-RQ3 NeLO evolution
Analyzing the Evolution of Linked Vocabularies
11 11
20
44
62 80
113
164 225
297
430 536
639 734
851 926 958 994
30 30
66
155
235 316
468
713
996 1373
2389 3166
4168 4940
5769 6407 6731 7047
1
10
100
1,000
10,000
Val
ue
Node
Edges
Number of nodes and edges in NeLOs
15 of 20
Results-RQ3 NeLO evolution
Analyzing the Evolution of Linked Vocabularies
2005 2008
2011 2017
The full results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution
16 of 20
Results-RQ3 NeLO evolution (In/out degree)
Analyzing the Evolution of Linked Vocabularies
0
10
20
30
40
50
2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
In-d
egre
e sc
ore
semiointervaloamo
0
50
100
150
200
250
300
Ou
t-d
egre
e sc
ore
vannskosccvs
17 of 20
Results-RQ3 NeLO evolution
Analyzing the Evolution of Linked Vocabularies
0
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
Pag
eRan
k s
core
Snapshot year
skos
dcam
vann
grddl
dctype
The PageRank scores for the top-five vocabularies for each year
18 of 20
Results-RQ3 NeLO evolution (HITS)
Analyzing the Evolution of Linked Vocabularies
0
0.05
0.1
0.15
0.2
0.25
2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
Hu
b s
core
vann
skos
cc
0
0.05
0.1
0.15
0.2
2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
Au
tho
rity
sco
re
vannskosccvs
19 of 20
Conclusions
• From NeLO’s 2018 density and average degree, we think that there is a need to increase the imports/exports relations between vocabularies.
• Only 16% of the existing terms are reused by the other
vocabularies. • Using recommending systems can help to increase the reuse amount
between vocabularies in order to avoid overlap and redundancy.
• The process of checking for the changes on other vocabularies is usually done manually.
• Many vocabularies are up-to-date in NeLO 2018. • 33 vocabularies are still using outdated terms.
Analyzing the Evolution of Linked Vocabularies
20 of 20
Conclusions
• There is a lack of tools to notify ontology engineers about changes in the vocabularies.
• The dynamics of changes has slowed down after some fast
evolution between 2001 and 2010.
• Based on the results, the reusing of terms was initially more common.
• Changes in the vocabularies with high PageRank and HITS scores affects many other vocabularies.
Analyzing the Evolution of Linked Vocabularies
More results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution
21 of 20
TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation
This work is supported by…
Linked Open Citation Database LOC-DB
AND
22 of 20
References
• Abdel-Qader, M., Scherp, A., Vagliano, I.: Analyzing the evolution of vocabulary terms and their
impact on the LOD Cloud. In: ESWC. Springer (2018)
• Cardoso, S.D., Pruski, C., Da Silveira, M., Lin, Y.C., Gro, A., Rahm, E., Reynaud- Delaitre, C.:
Leveraging the impact of ontology evolution on semantic annotations. In: European Knowledge
Acquisition Workshop. pp. 68-82. Springer (2016)
• Dividino, R., Gottron, T., Scherp, A.: Strategies for efficiently keeping local linked open data caches
up-to-date. In: ISWC. pp. 356-373. Springer (2015)
• Dos Reis, J.C., Pruski, C., Da Silveira, M., Reynaud-Delaitre, C.: Understanding semantic mapping
evolution by observing changes in biomedical ontologies. Journal of biomedical informatics 47, 71-
82 (2014)
• Ghazvinian, A., Noy, N.F., Jonquet, C., Shah, N., Musen, M.A.: What four million mappings can tell
you about two hundred ontologies. In: ISWC. pp. 229-242. Springer (2009)
• Gottron, T., Gottron, C.: Perplexity of index models over evolving linked data. In: ESWC. pp. 161-75.
Springer (2014)
• Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In:
ESWC. pp. 213{227. Springer (2013)
Analyzing the Evolution of Linked Vocabularies