Ontology engineering: Ontology alignment

Post on 18-Nov-2014




<ul><li> 1. Ontology Alignment. Course: Ontology Engineering </li>
<li> 2. Goals of the lecture: understand why ontology alignment is done; know which constructs can be used to express an alignment between two concepts; know what options there are for finding mappings. </li>
<li> 3. Agenda: why ontology alignment? Alignment relations. Alignment techniques. </li>
<li> 4. Why is ontology alignment done? </li>
<li> 5. Interoperability problem II: a private company wants to participate in a marketplace, e.g. eBay: Home &gt; Buy &gt; Cameras &amp; Photo &gt; Digital Cameras &gt; Digital SLR &gt; Nikon &gt; D40. Needed: correspondences between the entries of its catalogs and the entries of the marketplace's common catalog. </li>
<li> 6. Example use of vocabulary alignment: the term "Tokugawa", linking an AAT style/period and an SVCN period (labels shown: "Edo (Japanese period)", "Edo", "Tokugawa"). AAT is Getty's Art &amp; Architecture Thesaurus; SVCN is a local in-house ethnology thesaurus. </li>
<li> 7. Alignment architecture for P2P </li>
<li> 8. Two kinds of interoperability. Syntactic interoperability: using data formats that you can share; the XML family is the preferred option. Semantic interoperability: how to share meaning and concepts; technology for finding and representing semantic links. </li>
<li> 9. Reusing vocabularies </li>
<li> 10. The myth of a unified vocabulary: there will always be multiple ontologies, partly overlapping, in multiple languages, each with its own perspective. </li>
<li> 11. Links between ontologies: ontology alignment (ontology mapping) uses ontologies jointly by defining a limited set of links, in order to benefit from knowledge encoded in the other ontology and to enable access across applications and collections. Partial by nature! </li>
<li> 12. 
Why ontology alignment? Summary: there is no single ontology of the world. People work from different viewpoints and thus with multiple conceptualizations, but these concepts often overlap. Semantic relations between ontologies help to integrate information sources. This is currently seen as a major issue in the development of distributed (web) systems. </li>
<li> 13. How do we represent the alignment between two concepts? </li>
<li> 14. Link types between concepts in different ontologies. Equality (individual to individual): owl:sameAs, e.g. Den Haag = The Hague. Equivalence (class to class): owl:equivalentClass, e.g. wood-material = wood. Subclass (class to class): rdfs:subClassOf, e.g. aat:Artist rdfs:subClassOf wn:Artist. Instance of (individual to class): rdf:type, e.g. tgn:Africa rdf:type wn:Continent. Disjoint (class to class): owl:disjointWith, e.g. aat:wood owl:disjointWith wn:plastic. </li>
<li> 15. Types of links between concepts in different thesauri: skos:mappingRelation and its sub-properties skos:closeMatch, skos:exactMatch, skos:broadMatch, skos:narrowMatch and skos:relatedMatch. </li>
<li> 16. SKOS mapping properties. skos:closeMatch: symmetric property. skos:exactMatch: subPropertyOf skos:closeMatch; transitive and symmetric property. skos:relatedMatch: subPropertyOf skos:related; symmetric property. skos:narrowMatch: subPropertyOf skos:narrower; inverseOf skos:broadMatch. skos:broadMatch: subPropertyOf skos:broader; inverseOf skos:narrowMatch. </li>
<li> 17. Example: partial alignment between citations </li>
<li> 18. Example: alignment between XML Schemas </li>
<li> 19. Example: alignment between thesauri </li>
<li> 20. Types of links between properties in different ontologies: equivalentProperty, subPropertyOf, inverseOf. E.g. painterOf subPropertyOf creatorOf. Trick: wn:hyponym subPropertyOf rdfs:subClassOf. </li>
<li> 21. 
Types of links between concepts in different ontologies. Domain-specific links: Van Gogh (ULAN) born-in Groot-Zundert (TGN); Derain (ULAN) related-to Fauve (AAT); Wandelkaart Pyreneeën RANDO.07 Haute-Ariège - Vicdessos (Pied à Terre) related-to Pyrénées (TGN). Part-of relations. </li>
<li> 22. Alignment Techniques </li>
<li> 23. Alignment tools. Input: two ontologies, each consisting of a set of discrete entities (HTML table headers, XML elements, classes, properties). Output: relationships holding between these entities (equivalence, subsumption, etc.) plus a confidence measure; cardinality (e.g., 1:1, 1:m). </li>
<li> 24. Alignment techniques. Syntax: comparison of the characters of the terms; measures of syntactic distance. Language processing: e.g. tokenization, singular/plural. Relating to a lexical resource: relate terms to their place in the WordNet hierarchy. Taxonomy comparison: look for common parents/children in the taxonomy. Instance-based mapping: two classes are similar if their instances are similar. </li>
<li> 25. String-based techniques (1). Exact string match. Prefix: takes two strings as input and checks whether one starts with the other, e.g. net = network, but also hot = hotel. Suffix: takes two strings as input and checks whether one ends with the other, e.g. ID = PID, but also word = sword. </li>
<li> 26. String-based techniques (2). Edit distance: takes two strings as input and calculates the number of edit operations (e.g., insertions, deletions, substitutions of characters) required to transform one string into the other, normalized by the length of the longer string. EditDistance(NKN, Nikon) = 0.4 (2/5). </li>
<li> 27. Language-based techniques. Tokenization parses names into tokens by recognizing punctuation and case, e.g. Hands-Free Kits =&gt; hands, free, kits. Lemmatization morphologically analyses tokens in order to find all their possible base forms, e.g. Kits =&gt; Kit. Elimination discards "empty" tokens such as articles, prepositions, conjunctions, etc.,
e.g. a, the, by, type of, their, from </li>
<li> 28. Linguistic techniques using WordNet senses. A subClassOf B if A is a hyponym of B (Pine subClassOf Tree). A hasPart B if A is a holonym of B (Europe hasPart Greece). A = B if they are synonyms (Quantity = Amount). A disjoint B if they are antonyms or are siblings in the same part of the hierarchy (Pine disjoint Oak). </li>
<li> 29. Linguistic techniques: gloss-based. WordNet gloss comparison: the more words occur in both input glosses, the higher the similarity value. The equivalence relation is returned if the resulting similarity value exceeds a given threshold. Example glosses: "Maltese dog is a breed of toy dogs having a long straight silky white coat"; "Afghan hound is a tall graceful breed of hound with a long silky coat". </li>
<li> 30. Structural technique: taxonomy comparison </li>
<li> 31. Techniques for part-of relations. Phrase (Hearst) patterns: "add to", "is made of", "gives the ... its", "-containing", "consists of". </li>
<li> 32. Overview of alignment techniques </li>
<li> 33. Alignment issues (1). Nature of the input: underlying data models; schema-level vs. instance-level; example: linking WordNet to Wikipedia. Interpretation of the output: approximate vs. exact; graded vs. absolute confidence. Performance varies, hence semi-automatic alignment. </li>
<li> 34. Involving the human in alignment evaluation </li>
<li> 35. Evaluation of alignments. Judging individual alignments: precision. Comparison to a reference alignment: recall (and precision?). Comparing the logical consequences of the models. End-to-end evaluation. </li>
<li> 36. The intrinsic fuzziness of alignment </li>
<li> 37. AAT and WordNet </li>
<li> 38. Literature / acknowledgment. Some slides from this lecture are based on a tutorial by Pavel Shvaiko and Jerome Euzenat: http://dit.unitn.it/~accord/Presentations/ESWC05. Some slides are from Antoine Isaac (STICH). </li>
</ul>
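
The string-based techniques on slides 25 and 26 (prefix test, suffix test, normalized edit distance) can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of any alignment tool: the function names are hypothetical, and comparing labels case-insensitively is an assumption made here so that the slide's NKN / Nikon example yields 2/5.

```python
def prefix_match(a: str, b: str) -> bool:
    """True if the shorter label is a prefix of the longer one (case-insensitive)."""
    shorter, longer = sorted((a.lower(), b.lower()), key=len)
    return longer.startswith(shorter)


def suffix_match(a: str, b: str) -> bool:
    """True if the shorter label is a suffix of the longer one (case-insensitive)."""
    shorter, longer = sorted((a.lower(), b.lower()), key=len)
    return longer.endswith(shorter)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer string (0.0 = identical)."""
    a, b = a.lower(), b.lower()  # assumption: labels are compared case-insensitively
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    print(prefix_match("net", "network"))            # True
    print(prefix_match("hot", "hotel"))              # True, a false positive
    print(suffix_match("ID", "PID"))                 # True
    print(normalized_edit_distance("NKN", "Nikon"))  # 0.4
```

As the hot = hotel and word = sword examples show, these measures produce candidate mappings rather than confirmed ones, which is one reason alignment tools attach a confidence measure to their output.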