From Changes to Dynamics: Dynamics Analysis of Linked Open Data Sources
DESCRIPTION
Presentation at the PROFILES 2014 workshop (co-located with ESWC) on measuring the dynamics of linked data sources.
TRANSCRIPT
Institute for Web Science & Technologies – WeST
From Changes to Dynamics: Dynamics Analysis of Linked Open Data Sources
Renata Dividino, Thomas Gottron, Ansgar Scherp, Gerd Gröner
May 26th, 2014
PROFILES Workshop, Crete
Thomas Gottron PROFILES 26.5.2014, 2Dynamics of LOD
Linked Data Evolves
Linked Data
Growth in Volume!
Data changes!
[Figure: triples provided by data sources; axes: Time, Volume]
Effects on Indices and Caches
Impact on the accuracy of indices!
Updates of Indices and Caches
[Figure: Linked Data and Index]
Which sources to prioritise in an update?
Change Metrics
Comparison of two RDF data sets (e.g. from different points in time)
Xi: set of triple statements
Numeric expression for "distance"
Example: [Figure: data sets X1 and X2 and their distance Δ]
Suitable to measure dynamics?
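The example formula from this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch of one common change metric, the Jaccard distance between two sets of triples. The choice of metric, the function name `jaccard_distance` and the sample triples are assumptions for illustration, not necessarily what the slide showed.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def jaccard_distance(x1: FrozenSet[Triple], x2: FrozenSet[Triple]) -> float:
    """Return 1 - |X1 ∩ X2| / |X1 ∪ X2|; 0.0 means identical sets."""
    union = x1 | x2
    if not union:  # two empty data sets are treated as identical
        return 0.0
    return 1.0 - len(x1 & x2) / len(union)


# Hypothetical data: two small snapshots of one source.
x1 = frozenset({("a", "knows", "b"), ("a", "knows", "c")})
x2 = frozenset({("a", "knows", "b"), ("a", "knows", "d")})
delta = jaccard_distance(x1, x2)  # one shared triple out of three in total
```

Any other function that maps two triple sets to a number could play the role of Δ here.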
Toy example: Changes Analysis of LOD
1st snapshot
[Figure: graph of InstituteZBW and InstituteWeST with the persons Thomas, Gerd, Ansgar and Renata]
2nd snapshot
[Figure: the same graph with the additional node InstitutePaluno]
Changes detected between 1st and 2nd snapshot
1. Deleted: <InstituteWeST hasMember Gerd>
2. New: <InstitutePaluno hasMember Gerd>
3rd snapshot
[Figure: graph of InstituteZBW and InstituteWeST with the persons Thomas, Gerd, Ansgar and Renata; InstitutePaluno is gone]
Changes detected between 2nd and 3rd snapshot
1. New: <InstituteWeST hasMember Gerd>
2. Deleted: <InstitutePaluno hasMember Gerd>
Changes detected between 1st and 3rd snapshot
None!
Change metrics capture differences, not dynamics!
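The toy example can be replayed in a few lines. This is a sketch under assumptions: only the two triples about Gerd come from the slides, while the remaining memberships in `BASE` and the helper names `changes` and `change_count` are made up for illustration.

```python
# Only the two triples about Gerd are taken from the slides; the other
# memberships are assumed for illustration.
BASE = {
    ("InstituteWeST", "hasMember", "Thomas"),
    ("InstituteWeST", "hasMember", "Renata"),
    ("InstituteZBW", "hasMember", "Ansgar"),
}
snapshot1 = BASE | {("InstituteWeST", "hasMember", "Gerd")}
snapshot2 = BASE | {("InstitutePaluno", "hasMember", "Gerd")}
snapshot3 = BASE | {("InstituteWeST", "hasMember", "Gerd")}


def changes(old, new):
    """Return (deleted, added) triples between two snapshots."""
    return old - new, new - old


def change_count(old, new):
    """Size of the symmetric difference, used here as a simple Δ."""
    deleted, added = changes(old, new)
    return len(deleted) + len(added)


# Comparing only the endpoints misses the intermediate move.
direct = change_count(snapshot1, snapshot3)
# Following every step counts both the move and the move back.
stepwise = (change_count(snapshot1, snapshot2)
            + change_count(snapshot2, snapshot3))
print(direct, stepwise)  # 0 4
```

The direct comparison reports no change, while the stepwise sum records four triple-level changes. That gap is what a dynamics measure is meant to capture.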
A Framework for Linked Data Dynamics
Requirements
Dynamics function Θ: quantifies the evolution of a dataset X over a period of time
Dynamics as amount of evolution
[Figure: plot with Time axis]
Constructing a Dynamics Function
Function Θ difficult to define directly
Indirect definition over a change rate function c(Xt)
[Figure: plot with Time axis]
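The defining formula is not preserved in this transcript. One natural reading of "dynamics as amount of evolution" is that Θ accumulates the change rate over the observed period. The integral form and the interval bounds t_1 and t_n below are assumptions, offered as a sketch only.

```latex
\Theta_{t_1}^{t_n}(X) = \int_{t_1}^{t_n} c(X_t)\,\mathrm{d}t
```

Under this reading, a source that changes and later reverts still accumulates a positive Θ, as long as c is non-negative.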
Change Rate Function
Also c(Xt) is not explicitly known!
But it can be approximated!
Given snapshots of the data in small time intervals, the change rate can be approximated via change metrics.
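The formulas for this slide are missing from the transcript. A plausible sketch, assuming snapshots taken at times t_1 < t_2 < ... < t_n and a change metric Δ, treats the change rate as the measured change divided by the length of the interval. The indexing and the quotient form are assumptions.

```latex
X_{t_1}, X_{t_2}, \dots, X_{t_n}
\qquad
c(X_{t_i}) \approx \frac{\Delta\left(X_{t_{i-1}}, X_{t_i}\right)}{t_i - t_{i-1}}
```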
Dynamics Framework
Approximating c(Xt) as step function
[Figure: step-function approximation of c(Xt) over Time]
Choice of Δ: flexible use of different notions of change!
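A minimal sketch of the step-function idea, assuming the dynamics value is the area under the approximated change rate. The names `dynamics` and `symmetric_difference_size` are hypothetical, and the default Δ is just one possible notion of change that can be swapped out.

```python
from typing import Callable, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
Metric = Callable[[Set[Triple], Set[Triple]], float]


def symmetric_difference_size(old: Set[Triple], new: Set[Triple]) -> float:
    """One possible Δ: number of triples added or deleted."""
    return float(len(old ^ new))


def dynamics(snapshots: Sequence[Set[Triple]],
             times: Sequence[float],
             delta: Metric = symmetric_difference_size) -> float:
    """Integrate a step-function change rate over [times[0], times[-1]].

    On each interval the rate is delta / interval length, so its integral
    over that interval is simply delta; the total is the sum of the deltas.
    """
    if len(snapshots) != len(times):
        raise ValueError("need exactly one timestamp per snapshot")
    total = 0.0
    for i in range(1, len(snapshots)):
        width = times[i] - times[i - 1]
        if width <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rate = delta(snapshots[i - 1], snapshots[i]) / width  # step height
        total += rate * width                                  # step area
    return total
```

Passing a different `delta`, for example a set-based distance normalised to [0, 1], changes the notion of change without touching the rest of the computation.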
Use of Decay Functions
Introduction of Decay
So far: impact of evolution independent of moment in time
Desirable: focus on certain periods of time
• e.g. recent past
Solution: decay function f to assign weights to moments in time
[Figure: plot with Time axis]
Implementing a Decay Function
Exponential decay function:
Incorporated in the framework:
When using the step function approximation of c(Xt):
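The three formulas on this slide did not survive extraction. As a hedged sketch, assume an exponential weight exp(-λ · age), where age is the distance of a moment from the most recent snapshot, and assume each step's change is weighted by the decay factor at the end of that step. The parameter `lam`, the function names and this particular discretisation are assumptions, not the authors' confirmed definition.

```python
import math
from typing import Sequence


def exponential_decay(age: float, lam: float) -> float:
    """Weight for a moment lying `age` time units in the past."""
    return math.exp(-lam * age)


def decayed_dynamics(deltas: Sequence[float],
                     times: Sequence[float],
                     lam: float) -> float:
    """Sum step changes, each weighted by its age relative to times[-1].

    deltas[i] is the change measured between times[i] and times[i + 1].
    """
    if len(deltas) != len(times) - 1:
        raise ValueError("need one delta per pair of consecutive snapshots")
    now = times[-1]
    return sum(d * exponential_decay(now - times[i + 1], lam)
               for i, d in enumerate(deltas))
```

With `lam = 0` every step gets weight 1 and the result reduces to the undecayed sum of changes. Larger values of `lam` shift the emphasis towards the recent past.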
Some Results
Experiments
84 snapshots (approx. 1.5 years)
652 data sources (PLD)
Dynamics on data level
We use data from the Dynamic Linked Data Observatory
Weekly snapshots, 16M triples
Change Rate Function of Selected Data Sources
dbpedia.org: Θ = 55.71, Θdecay = 23.42
identi.ca: Θ = 58.45, Θdecay = 18.48
linkedct.org: Θ = 51.75, Θdecay = 25.03
dbtune.org: Θ = 20.90, Θdecay = 8.33
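To see what the two scores imply for the question of which sources to prioritise, the reported values can be ranked. This is only an illustration: ordering sources by score is an assumed use of the values, and the pairing of each value line with the source name that follows it is inferred from the slide layout.

```python
# Values as reported on the slide: source -> (Θ, Θdecay).
results = {
    "dbpedia.org": (55.71, 23.42),
    "identi.ca": (58.45, 18.48),
    "linkedct.org": (51.75, 25.03),
    "dbtune.org": (20.90, 8.33),
}

by_theta = sorted(results, key=lambda s: results[s][0], reverse=True)
by_decay = sorted(results, key=lambda s: results[s][1], reverse=True)
print(by_theta)  # ['identi.ca', 'dbpedia.org', 'linkedct.org', 'dbtune.org']
print(by_decay)  # ['linkedct.org', 'dbpedia.org', 'identi.ca', 'dbtune.org']
```

The two orderings differ: identi.ca leads on Θ but drops to third on Θdecay, while linkedct.org moves to the top once recent changes are weighted more heavily.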
Conclusion
Summary
Framework to capture the dynamics of LOD data sources
Configurable to use different change metrics
Incorporation of a decay function
Values align with intuitive definition
Future Work
Better approximations of the change rate function
Incorporating the notion of dynamics into update strategies for LOD indices and caches
Thanks!
Contact: Thomas Gottron
WeST – Institute for Web Science and Technologies
Universität Koblenz-Landau