Detection of Embryonic Research Topics
by Analysing Semantic Topic Networks
Angelo Antonio Salatino, Enrico Motta
@angelosalatino
SAVE-SD @ WWW2016
Detecting Topic Trends
• In a recognised research area we can find
two main stages:
– initial stage
– recognised
• Can we intervene before?
Hypothesis
• We hypothesise the existence of an earlier
embryonic phase:
– The topic itself has no label yet, but
– we theorise that it can be detected by
analysing the dynamics of already established
research areas
Experiment
• Dataset
– Semantically-enhanced co-occurrence graph
• Selection Phase
– Debutant topics vs. Control group
• Analysis Phase
– Statistical analysis of the two populations
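As a rough illustration of the dataset step, the sketch below builds one topic co-occurrence graph per year from a handful of made-up paper records. The records, the `CANONICAL` label-merging table and the function name `build_yearly_graphs` are all hypothetical; the table only stands in for the semantic enhancement and is not the actual Klink-2 output.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy records: (publication year, topics attached to the paper).
PAPERS = [
    (2001, ["ontology", "knowledge management systems", "semantic web technologies"]),
    (2001, ["ontology", "semantic web"]),
    (2002, ["linked data", "semantic web"]),
]

# Hypothetical label-merging table standing in for the semantic enhancement:
# equivalent labels collapse onto one canonical topic.
CANONICAL = {
    "semantic web technologies": "semantic web",
    "knowledge management systems": "knowledge management system",
}


def build_yearly_graphs(papers, canonical):
    """Return {year: Counter({(topic_a, topic_b): number of co-occurrences})}."""
    graphs = defaultdict(Counter)
    for year, topics in papers:
        # Merge equivalent labels, drop duplicates, and sort so that each
        # undirected edge always gets the same (a, b) key.
        merged = sorted({canonical.get(topic, topic) for topic in topics})
        for pair in combinations(merged, 2):
            graphs[year][pair] += 1
    return dict(graphs)


graphs = build_yearly_graphs(PAPERS, CANONICAL)
for year in sorted(graphs):
    print(year, dict(graphs[year]))
```

Keeping one graph per year is a design choice made here so that the weight of an edge can later be followed over time.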
Experiment: Dataset
• From the topic network we selected two groups of topics:
– debutant group: topics that made their debut in the period between 2000 and 2010
– control group: already existing in the decade 2000-10
Semantic Topic Networks using Klink-2 by
Osborne et al. @ ISWC 2015
[Figure: example topic network. Node labels: semantic web technology, semantic web, semantic web technologies, ontology mapping, ontology matching, case study, knowledge management systems, knowledge management system, linked datum, linked data, fast implementation]
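The split into the two groups described above can be sketched as a simple filter on the year in which each topic first appears. The debut years below are invented for illustration, and reading "already existing" as "first seen before 2000" is an assumption, not something the slides state.

```python
# Hypothetical first-appearance years; not taken from the actual dataset.
DEBUT_YEAR = {
    "semantic web": 2001,
    "cloud computing": 2006,
    "ontology": 1985,
    "neural networks": 1960,
    "hypothetical late topic": 2013,
}


def split_groups(debut_year, start=2000, end=2010):
    """Split topics into (debutant, control) lists by their debut year."""
    # Debutant: first seen inside the observation window.
    debutant = sorted(t for t, year in debut_year.items() if start <= year <= end)
    # Control (assumed reading): already present before the window opens.
    control = sorted(t for t, year in debut_year.items() if year < start)
    return debutant, control


debutant, control = split_groups(DEBUT_YEAR)
print("debutant:", debutant)
print("control:", control)
```

Topics that first appear after the window fall into neither group in this sketch.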
Experiment: Selection Phase
For each testing topic we have:
Experiment: Analysis Phase
Clique metric:
• Harmonic mean
• Arithmetic mean
Timeline metric:
• Linear regression of the time series (slope 𝛼)
• Difference between the extreme values
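A minimal sketch of the four building blocks named above, under the assumption that a clique metric condenses the edge weights of one clique in one year into a single number, and that a timeline metric then measures how that number moves across years. The weights are made up and the helper names `slope` and `extreme_difference` are hypothetical.

```python
from statistics import harmonic_mean, mean


def slope(values):
    """Least-squares slope of the values against their index 0, 1, 2, ..."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    denominator = sum((x - x_bar) ** 2 for x in range(n))
    return numerator / denominator


def extreme_difference(values):
    """Difference between the last and the first value of the series."""
    return values[-1] - values[0]


# Hypothetical edge weights of one clique, one inner list per year.
CLIQUE_WEIGHTS_BY_YEAR = [
    [1.0, 2.0, 4.0],
    [2.0, 2.0, 4.0],
    [3.0, 3.0, 3.0],
]

# Clique metric: one value per year.
am_series = [mean(weights) for weights in CLIQUE_WEIGHTS_BY_YEAR]
hm_series = [harmonic_mean(weights) for weights in CLIQUE_WEIGHTS_BY_YEAR]

# Timeline metric: one trend value per series.
print("AM slope:", slope(am_series))
print("AM extreme difference:", extreme_difference(am_series))
print("HM slope:", slope(hm_series))
print("HM extreme difference:", extreme_difference(hm_series))
```

The harmonic mean is pulled towards the smallest weight, so it only grows when all edges of the clique strengthen together, whereas the arithmetic mean can rise on the back of a single strong edge.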
Findings
• We performed two evaluations over 3 million
publications
• Preliminary Evaluation:
– 2 topics in the debutant group (Semantic Web
and Cloud Computing)
– Tested all the combinations of the mentioned
techniques
• Evaluation:
– 50 topics in each of the debutant and non-debutant groups
Findings: Preliminary Evaluation
• AM-N: arithmetic mean and the
difference between the two
extreme values;
• AM-CF: arithmetic mean and the
linear interpolation;
• HM-N: harmonic mean and the
difference between the first and
the last values;
• HM-CF: harmonic mean and the
linear interpolation.
[Figure: results split by year, with panels for Semantic Web and Cloud Computing]
p-value = 7.0280 × 10⁻¹²
Findings: Interesting Insights
Findings: Evaluation
• We used different
sizes of the subgraph
associated with each
testing topic
p-values ≤ 1.28 × 10⁻⁵¹
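The slides do not name the statistical test behind these p-values. As a stand-in only, the sketch below compares two populations of trend values with a two-sided permutation test on the difference of means; the function name, the sample values and the choice of test are all assumptions, and this is not presented as the authors' procedure.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the absolute difference of means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        # Reassign the group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical trend values for the two populations.
debutant_trends = [0.9, 1.1, 1.3, 1.0, 1.2]
control_trends = [0.1, -0.2, 0.0, 0.2, -0.1]

p_value = permutation_p_value(debutant_trends, control_trends)
print("p-value:", p_value)
```

With only a few thousand resamples the smallest p-value this sketch can report is about 1/2001, so it illustrates the comparison rather than reproducing magnitudes like those on the slide.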
Conclusion
• Our findings confirm the initial hypothesis
• Next steps:
– Automatic detection of embryonic topics by
analysing the topic network and identifying
subgraphs exhibiting such dynamics
– Analyse dynamics in other networks (e.g.,
authors and venues)