An Empirical Study of Vocabulary Relatedness and Its Application to Recommender Systems

Gong Cheng, Saisai Gong, Yuzhong Qu

State Key Laboratory for Novel Software Technology, Nanjing University, China

gcheng@nju.edu.cn

Presented at ISWC 2011

Gong Cheng (程龚) gcheng@nju.edu.cn

ws .nju.edu.cn

Vocabulary matching

Measuring term similarity

FullProfessor, FacultyMember, AssistantProfessor

Professor, Faculty, AssistantProfessor

Vocabulary matching

Vocabulary distance

Measuring vocabulary similarity

Semantic Web for Research Communities (SWRC)

eBiquity Person

Foundational Model of Anatomy (FMA)

NCBI organismal classification (NCBITaxon)

0.6, 0.02

Vocabulary matching

Vocabulary distance

Vocabulary relatedness

Measuring vocabulary relatedness

FullProfessor

FacultyMember

AssistantProfessor

PhD

Postgraduate-Research-Degree

not that similar, but somewhat related

Contributions

How to measure vocabulary relatedness?

6 measures, from 4 aspects

How about vocabulary relatedness in real-life cases?

Empirical analysis of 2,996 vocabularies and 4 billion other RDF triples

Where to apply vocabulary relatedness?

Post-selection vocabulary recommendation in vocabulary search

Outline

Data set

Post-selection vocabulary recommendation

Conclusions

Data set statistics

Crawled from February 2010 to May 2011 by

Data set distributions

RDF documents over pay-level domains

Vocabularies over top-level domains

Outline

Data set

Conclusions

6 numerical measures, from 4 aspects

Semantic relatedness (explicit, implicit, hybrid)

Content similarity

Expressivity closeness

Distributional relatedness

Comparison

Measure 1: explicit semantic relatedness

Explicit relations between vocabularies $v_1$, $v_2$, $v_3$ (figure): owl:imports, rdfs:seeAlso, owl:priorVersion

$R_E(v_i, v_j)$ = weight of a shortest path between $v_i$ and $v_j$ in $G_E$
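Measure 1 is stated in terms of the weight of a shortest path between two vocabularies in a graph of explicit relations (owl:imports, rdfs:seeAlso, owl:priorVersion). The sketch below shows one way such a path weight could be computed with Dijkstra's algorithm. The edge weights, the undirected treatment of edges, and the names `shortest_path_weight` and `EXPLICIT_EDGES` are illustrative assumptions, not taken from the slides.

```python
import heapq


def shortest_path_weight(edges, source, target):
    """Return the minimum total weight of a path from source to target,
    or None if no path exists. Edges are (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        # Assumption: explicit relations are treated as undirected here.
        graph.setdefault(v, []).append((u, w))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return None


# Hypothetical weights for relations among the vocabularies v1, v2, v3.
EXPLICIT_EDGES = [
    ("v1", "v2", 1.0),  # e.g. an owl:imports link
    ("v2", "v3", 2.0),  # e.g. an rdfs:seeAlso link
    ("v1", "v3", 5.0),  # e.g. an owl:priorVersion link
]

print(shortest_path_weight(EXPLICIT_EDGES, "v1", "v3"))
```

With these made-up weights, the two-hop route through v2 is cheaper than the direct edge, which is the situation a shortest-path definition is meant to capture.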

Measure 2: implicit semantic relatedness

Figure: vocabularies v2, v3, v4 and their terms t2, t3, t4, with owl:inverseOf and rdfs:subClassOf relations between terms
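The figure for Measure 2 suggests that relations between terms (owl:inverseOf, rdfs:subClassOf) imply relations between the vocabularies that define those terms. A minimal sketch of that lifting step follows. Which terms are linked, the term-to-vocabulary mapping, and the function name `lift_to_vocabularies` are assumptions made for illustration only.

```python
def lift_to_vocabularies(term_relations, term_to_vocab):
    """Map term-level relations to vocabulary-level edges.

    term_relations: iterable of (subject_term, property, object_term).
    term_to_vocab:  dict from term to the vocabulary that defines it.
    Returns a dict from an unordered vocabulary pair to the set of
    properties that connect terms of the two vocabularies.
    """
    vocab_edges = {}
    for subject, prop, obj in term_relations:
        vs, vo = term_to_vocab[subject], term_to_vocab[obj]
        if vs == vo:
            continue  # a relation inside one vocabulary links nothing new
        vocab_edges.setdefault(frozenset((vs, vo)), set()).add(prop)
    return vocab_edges


# Hypothetical reading of the figure: t2, t3, t4 belong to v2, v3, v4.
TERM_TO_VOCAB = {"t2": "v2", "t3": "v3", "t4": "v4"}
TERM_RELATIONS = [
    ("t2", "owl:inverseOf", "t3"),
    ("t3", "rdfs:subClassOf", "t4"),
]

IMPLICIT_EDGES = lift_to_vocabularies(TERM_RELATIONS, TERM_TO_VOCAB)
for pair, props in IMPLICIT_EDGES.items():
    print(sorted(pair), sorted(props))
```

The resulting vocabulary-level edges could then feed the same kind of path computation used for explicit relations.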

Measure 3: hybrid semantic relatedness

Statistical properties of $G_E$, $G_I$ and $G_{E+I}$

Empirical analysis (1)

Explicit relations between vocabularies

Measure 4: content similarity

Harmonic mean

Maximum similarity between their labels
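The two cues on this slide, a harmonic mean and the maximum similarity between labels, could be combined as in the sketch below. It assumes that term similarity is the maximum similarity over label pairs, that each direction is scored as the average best match per term, and that the two directions are merged with a harmonic mean. The slides do not spell out these details, and `difflib.SequenceMatcher` is only a stand-in for whatever string measure the authors used.

```python
from difflib import SequenceMatcher


def label_similarity(a, b):
    """String similarity in [0, 1]; a stand-in measure (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def term_similarity(labels_a, labels_b):
    """Maximum similarity between the labels of two terms."""
    return max(label_similarity(a, b) for a in labels_a for b in labels_b)


def coverage(vocab_a, vocab_b):
    """Average, over terms of vocab_a, of the best match in vocab_b.
    Both vocabularies are assumed to be non-empty."""
    total = 0.0
    for labels_a in vocab_a.values():
        total += max(term_similarity(labels_a, labels_b)
                     for labels_b in vocab_b.values())
    return total / len(vocab_a)


def content_similarity(vocab_a, vocab_b):
    """Harmonic mean of the two directional coverage scores."""
    ab = coverage(vocab_a, vocab_b)
    ba = coverage(vocab_b, vocab_a)
    if ab + ba == 0:
        return 0.0
    return 2 * ab * ba / (ab + ba)


# Terms borrowed from the earlier matching example; each term maps to
# its list of labels (here just the local name).
VOCAB_A = {
    "FullProfessor": ["FullProfessor"],
    "FacultyMember": ["FacultyMember"],
    "AssistantProfessor": ["AssistantProfessor"],
}
VOCAB_B = {
    "Professor": ["Professor"],
    "Faculty": ["Faculty"],
    "AssistantProfessor": ["AssistantProfessor"],
}

print(round(content_similarity(VOCAB_A, VOCAB_B), 3))
```

A harmonic mean penalises lopsided cases: if one vocabulary is well covered by the other but not the reverse, the combined score stays low.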

86 label-like properties

rdfs:label, dc:title, and their subproperties (e.g. skos:prefLabel), and local name

Terms and their labels (pie chart): 63.67% / 36.33%

Vocabulary distribution (pie chart): 36.21% / 63.79%

Measure 5: expressivity closeness

MetaTerms

rdfs:domain

owl:inverseOf

owl:TransitiveProperty

rdf:type

Jaccard
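Measure 5 pairs meta-level terms with the Jaccard coefficient. Assuming each vocabulary is characterised by the set of meta-level terms it uses, the closeness of two vocabularies could be computed as below. The example sets and the names `jaccard`, `META_V1` and `META_V2` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical sets of meta-level terms used by two vocabularies.
META_V1 = {"rdfs:domain", "owl:inverseOf", "rdf:type"}
META_V2 = {"rdfs:domain", "owl:TransitiveProperty", "rdf:type"}

print(jaccard(META_V1, META_V2))
```

Here the two vocabularies share two of four distinct meta-level terms, so the coefficient is 0.5.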

4,978 meta-level terms, 469 (9.42%) in >1 vocabulary
