semantic similarity assessment to browse resources exposed as linked data: an application to habitat...
DESCRIPTION
INSPIRE 2011, slideshowTRANSCRIPT
![Page 1: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/1.jpg)
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets R. Albertoni, M. De Martino, Institute for Applied Mathematics and Information Technologies National Research Council (CNR), Italy
INSPIRE 2011, Edinburgh, June 29, 2011
![Page 2: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/2.jpg)
Outline
! Linked data - Motivation
! EUNIS Habitat and Species
! Asymmetric and context dependent Semantic Similarity ! Two contexts
! Examples of assessments
! Semantic similarity – Query refinement ! searching for geographical data set
! Conclusion and remarks
![Page 3: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/3.jpg)
Linked Data
Why Linked data ?
! Data Portability across current Data Silos
! HTTP based Open Database Connectivity
! Platform Independent Data & Information Access Linked Data Spaces –
! Serendipitous Discovery of relevant things via the Web
Examples of geographical related linked data datasets
EARTH, GEMET, EUNIS SPECIES & SITE, LINKED GEO DATA, GEONAMES …
Items in “why Linked data” are borrowed from the Kingsley Idehen’s presentation “Creating_Deploying_Exploiting_Linked_Data2”
![Page 4: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/4.jpg)
What can we do with linked data?
Applications already successful:
! Improve/enrich the result returned by search engine (RDF/RDFa snippets) (Google, Yahoo)
! Linked data driven mesh-ups considering data from different sources (LOD Graph,…)
What else we can do?
! We want to push ahead with Serendipitous Discovery supporting decision making by analyzing Linked Data sources
! Tools analyzing linked data: Context Dependent Instance Semantic Similarity
! Albertoni R., De Martino M., Asymmetric and context-dependent semantic similarity among ontology instances, Journal on Data Semantics X, Springer Verlag, pp 1-30, (2008).
![Page 5: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/5.jpg)
EUNIS Species-Habitats
![Page 6: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/6.jpg)
EUNIS Habitat and Species mapped in SKOS and published as Linked Data
skos:prefLabel
URI: http://linkeddata.ge.imati.cnr.it:2020/…/B2.1
skos:description
![Page 7: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/7.jpg)
Species and Habitats are instances of SKOS schema
skos:description “Beach and upper beach formations, mostly of annuals of the low … ….. characteristic are [Cakile edentula], [Polygonum norvegicum] ([Polygonum oxyspermum ssp. raii]), [Atriplex longipes] s.l., [Atriplex glabriuscula], [Mertensia maritima].
Species are easily identifiable in the Habitat title and description !!!!
We didn’t use SILK, We just developed an ad hoc interlinking procedure in JENA
![Page 8: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/8.jpg)
Applying semantic similarity on EUNIS Species-Habitats
Details among context formalization and mathematical formulas behind our semantic similarity are available in Albertoni R., De Martino M., Asymmetric and context-dependent semantic similarity among ontology instances, Journal on Data Semantics X, Springer Verlag, pp 1-30, (2008).
![Page 9: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/9.jpg)
Definition of contexts- parameterizations of our instance similarity
Context 1:“habitat species-based similarity” habitats are compared according to the species that they host or vice versa
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
[skos:Concept]->{{},{(skos:relatedMatch, Inter)}
Context 2: “taxonomy-based similarity” habitats or species instances are compared with respect to their position in the taxonomy hierarchy
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
[skos:Concept]->{ {},{(skos:broader, Inter)}}
You can have contexts as complex as you want, for example 1) considering different ontology schemas 2) providing recursive similarity assessment
![Page 10: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/10.jpg)
Context 1:“habitat species-based similarity” habitats are compared according to the species that they host or vice versa
PREFIX skos: <http://www.w3.org/2004/02/skos/core#> [skos:Concept]->{{},{(skos:relatedMatch, Inter)}
SIM(B211,X)=SIM(X, B211)=0
SIM(B211,X)=1/3 SIM(X,B211)=1/2
SIM(B211,X)=2/4 SIM(X,B211)=1
SIM(B211,X)=SIM(X, B211)=1
![Page 11: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/11.jpg)
Context 2: “taxonomy-based similarity” habitats or species instances are compared with respect to their position in the taxonomy hierarchy
PREFIX skos: <http://www.w3.org/2004/02/skos/core#> [skos:Concept]->{ {},{(skos:broader, Inter)}}
JENA RULES to get skos:broader as transitive an reflexive relations in order to compare nodes according to their ancestors
(?x skos:broader ?y) (?y skos:broader ?z)-> (?x skos:broader ?z) (?y skos:broader ?z)-> (?y skos:broader ?y)
![Page 12: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/12.jpg)
Our semantic similarity was adapted to work with Linked Data
(Here we have consider fairly “harmonized” linked data sets)
Semantic similarity design enhancements:
! Direct access to linked data (No anymore centralized ontology driven repositories): ! (i) Follow your nose approach, (ii) RDF Dumps, (iii) SPARQL End Points
! Increased independence from the ontology schema ! CONTEXTs can mix up different light weighted ontology schemas, since it is
common practice in Linked data.
! A reasoner to add simple RDF entailments
Quite challenging when we consider sources that are not “harmonized” ! non-authoritative resources, heterogeneous schema, non-consistently identified entities ! Riccardo Albertoni, Monica De Martino: Semantic Similarity and Selection of
Resources Published According to Linked Data Best Practice. OTM Workshops 2010, LNCS vol. 6428/2010
![Page 13: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/13.jpg)
Result considering Habitats and sub habitats of Coastal shingle (B2)
Context A if SIM(X,Y)=1 and SIM(Y,X)=1 than Y contains the same species of X; if SIM(X,Y)=1 and SIM(Y,X)<1 than Y contains the species of X but the vice versa is not true; SIM(X,Y) is proportional to the percentage of species in X that are contained in Y out of the overall species of X.
![Page 14: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/14.jpg)
Comparing species according to habitats they can be found in
![Page 15: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/15.jpg)
HOW to USE IT Example: Searching for data
• you might want similarity to refine your keyword query • habitats and species can be deployed as Thesaurus/controlled vocabulary
ADVANTAGES in our approach wrt other similarities • Different contexts even more personalized suggestions • Asymmetry/Containment Highlighting even more information when browsing the refinement alternatives
![Page 16: Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54bc9b444a79598f528b4618/html5/thumbnails/16.jpg)
Conclusion ! After publishing your data, let’s start to consume Linked Data not
only for meshing up !! ! Assumed data is properly interlinked, we can consume data from
different distributed sources and mixing up light weighted ontologies\schemas.
! The more dataset are interlinked, the more are the potential contexts and similarity applications
! Here we presented some very simple examples ! We can define more complex context considering instances’ relations
and properties ! Our semantic similarity is a working prototype written in JAVA/JENA
! Future work ! Further uses cases (Do you fancy trying our semantic similarity on your
data? Let’s talk about it) ! Developments of a front end to define user-driven contexts ! Further reengineering of the prototype to scale up even more complex
use cases