mapping concepthubberlin
Semantic Mapping through a Concept Hub
Dagobert Soergel
College of Information Studies, University of Maryland
Department of Library and Information Studies, University at Buffalo
2
Hub
Water transport
Inland water transport
Ocean transport
Traffic station ⊓ Water transport
Traffic station ⊓ Inland water tr.
Traffic station ⊓ Ocean transport
Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports
LCSH
Shipping
Inland water transport
Merchant marine
Harbors
German: Hafen
Mapping through a Hub
3
Outline
• Objective: Interoperability Plus
• KOS concept hub
• Method: Knowledge-based, computer-assisted creation of canonical representations of concepts
• Resulting knowledge base and applications
4
Objective
Improve semantic-based search of digital content across multiple collections in multiple languages.
• Interoperability between any two participating KOS (Knowledge Organization Systems)
• Support for search, esp. facet-based search
  • for any collection indexed by a participating KOS
  • for free-text search
• Assistance in cataloging (metadata creation) by catalogers or users (social tagging)
• Long-range goal: Web service where a KOS can be uploaded and mappings to specified target KOS are returned
5
KOS Concept Hub
• The backbone of the proposed system is a faceted core classification of atomic concepts together with a set of relationships
• Interoperability is achieved by expressing concepts from all participating KOS in a canonical representation: a description logic (DL) formula using atomic concepts and relationships
• Mapping from KOS to KOS is achieved by reasoning over these canonical representations
6
Hub
Water transport
Inland water transport
Ocean transport
Traffic station ⊓ Water transport
Traffic station ⊓ Inland water tr.
Traffic station ⊓ Ocean transport
Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports
LCSH
Shipping
Inland water transport
Merchant marine
Harbors
German: Hafen
Mapping through a Hub
7
Method: How to get DL formulas
Key: Efficient creation of canonical representations (DL formulas)
• Apply existing knowledge: a large knowledge base ▬► less effort for processing a new KOS
• Use knowledge of KOS structure for hierarchical inheritance
• Use linguistic analysis of terms and captions
• Eliminate redundant atomic concepts
• Check or produce mapping results from assignment of concepts to the same records
• Get human editors’ input and verification where needed through a user-friendly interface
• KOS “owners” may verify and edit data pertaining to their KOS
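The bullet on checking or producing mappings from the assignment of concepts to the same records could be sketched as follows. This is only an illustration: the record IDs and index assignments are invented, and the Jaccard overlap is one plausible score chosen here, not a measure named in the slides.

```python
from collections import defaultdict

# Hypothetical records indexed in two KOS; IDs and assignments are invented.
records = {
    "rec1": {"LCC": {"TL681.S6"}, "LCSH": {"Aeroplanes-Soundproofing"}},
    "rec2": {"LCC": {"TL681.S6"}, "LCSH": {"Aeroplanes-Soundproofing"}},
    "rec3": {"LCC": {"VM367.S6"}, "LCSH": {"Ships-Soundproofing"}},
    "rec4": {"LCC": {"TL681.S6"}, "LCSH": {"Ships-Soundproofing"}},
}


def postings(kos):
    """Map each term of one KOS to the set of records indexed with it."""
    index = defaultdict(set)
    for rec_id, assignments in records.items():
        for term in assignments.get(kos, ()):
            index[term].add(rec_id)
    return index


def cooccurrence(source_kos, target_kos):
    """Jaccard overlap of the record sets for every co-occurring term pair."""
    src, tgt = postings(source_kos), postings(target_kos)
    scores = {}
    for s, s_recs in src.items():
        for t, t_recs in tgt.items():
            shared = s_recs & t_recs
            if shared:
                scores[(s, t)] = len(shared) / len(s_recs | t_recs)
    return scores


scores = cooccurrence("LCC", "LCSH")
```

A high score supports a mapping derived from the DL formulas; a low score on a formula-derived mapping, or a high score on an unmapped pair, would be a candidate for editor review.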
8
Knowledge base
Requires an ever larger classification and lexical knowledge base containing many kinds of data:
1. A faceted classification of atomic concepts, seeded from sources with well-developed facets such as the AOD Thesaurus, the Harvard Business Thesaurus, the Art and Architecture Thesaurus, and various ontologies
2. Linguistic knowledge bases such as WordNet and mono-, bi-, and multilingual dictionaries and thesauri
3. Many KOS (Knowledge Organization Systems), such as LCC, DDC, DMOZ directory, LCSH, Gene Ontology, Schlagwortnormdatei
4. These will over time be fused into one large multilingual knowledge base with many terminological and translation relationships and relationships linking terms to concepts, with an increasing number of concepts semantically represented by a DL formula.
9
Examples of deriving DL formulas
10
Underlying faceted classification
L00 Transportation and traffic
  L10 Traffic system components
    L13 Traffic facilities
    L15 Traffic stations
    L17 Vehicles
  L30 Modes of transportation
    L33 Air transport
    L37 Water transport
P00 Buildings, construction
  P23 Buildings
  P27 Architecture
  P43 Construction
R00 Engineering
  R30 Acoustics
    R37 Soundproofing
T70 Military vs. civilian
  T73 Military
  T77 Civilian
11
Method: Assigning atomic concepts 1
HE Transportation = L00 Transportation and traffic ⊓ T77 Civilian
HE550-560 Ports, harbors, docks, wharves, etc.
Inherited: L00 Transportation and traffic ⊓ T77 Civilian
Added by editor: L15 Traffic stations ⊓ L37 Water transport
Resolved to: L15 Traffic stations ⊓ L37 Water transport ⊓ T77 Civilian
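The HE550-560 example combines hierarchical inheritance with the elimination of redundant atomic concepts: L00 disappears from the resolved formula because L15 and L37 are narrower. A minimal sketch of that step, assuming a formula is a plain conjunction held as a set of notations and that the parent links below (read off the notation, not stated on the slides) are correct:

```python
# Parent links for a fragment of the underlying faceted classification.
# The exact depth of each class is an assumption based on the notation.
PARENT = {
    "L10": "L00", "L13": "L10", "L15": "L10", "L17": "L10",
    "L30": "L00", "L33": "L30", "L37": "L30",
    "T73": "T70", "T77": "T70",
}


def ancestors(concept):
    """Return all broader concepts of an atomic concept."""
    found = set()
    while concept in PARENT:
        concept = PARENT[concept]
        found.add(concept)
    return found


def resolve(inherited, added):
    """Conjoin inherited and editor-added concepts, then drop every concept
    that is broader than another concept in the same conjunction."""
    combined = set(inherited) | set(added)
    redundant = set()
    for concept in combined:
        redundant |= ancestors(concept) & combined
    return combined - redundant


he = {"L00", "T77"}                  # HE Transportation
he550 = resolve(he, {"L15", "L37"})  # HE550-560 Ports, harbors, docks, ...
```

Here `he550` holds L15, L37 and T77, which agrees with the "Resolved to" line of the slide.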
12
Method: Assigning atomic concepts 2
NA6300-6307 Airport buildings
From database already established:
Airport = L15 Traffic stations ⊓ L33 Air transport
Buildings = P23 Buildings
Added by editor: T77 Civilian
Resolved to: L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian
13
Method: Assigning atomic concepts 3
TL681.S6 Airplanes. Soundproofing
From database already established:
Airplane = L17 Vehicles ⊓ L33 Air transport
Soundproofing = R37 Soundproofing
Added by editor: Nothing
Resolved to: L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
14
Method: Assigning atomic concepts 4
Aeroplanes-Soundproofing
From database already established:
Aeroplanes = Airplane [Spelling variant]
Therefore the term is recognized as the same as Airplanes. Soundproofing
Resolved to: L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
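The last three examples build a formula from components that are already in the knowledge base, including a spelling variant. A rough sketch of that lookup follows; the dictionary layout, the variant table and the function name `formula_for` are assumptions made for illustration, and real linguistic analysis of captions would go well beyond splitting on punctuation.

```python
import re

# Term-to-formula entries taken from the slides; the structure is assumed.
LEXICON = {
    "airport": {"L15", "L33"},
    "airplane": {"L17", "L33"},
    "buildings": {"P23"},
    "soundproofing": {"R37"},
}
# Plural and spelling variants mapped to a lexicon key (illustrative only).
VARIANTS = {"airplanes": "airplane", "aeroplane": "airplane",
            "aeroplanes": "airplane"}


def formula_for(caption, added_by_editor=()):
    """Conjoin the formulas of all recognized caption components.

    Returns the formula and the list of components that were not
    recognized and therefore need an editor's attention."""
    formula, unknown = set(added_by_editor), []
    for part in re.split(r"[^A-Za-z]+", caption):
        if not part:
            continue
        key = VARIANTS.get(part.lower(), part.lower())
        if key in LEXICON:
            formula |= LEXICON[key]
        else:
            unknown.append(part)
    return formula, unknown


lcc, _ = formula_for("Airplanes. Soundproofing")
lcsh, _ = formula_for("Aeroplanes-Soundproofing")
airport, _ = formula_for("Airport buildings", added_by_editor={"T77"})
```

Because both captions reduce to the same set of atomic concepts, the LCC class and the LCSH heading receive the same canonical representation without further editor input.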
15
Method: Assigning atomic concepts 5
Any class formed by geographical subdivision, such as
NA6300-6307 Airport buildings
NA6305.E3 Egypt
Recognized using a dictionary of geographical names
Inherits from the subject class above it; simply add the country
L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian ⊓ Egypt
No editor checking needed
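Geographical subdivision is the fully automatic case. A small sketch under the assumption that the dictionary of geographical names is a simple set (the three entries are a stand-in, and `geographic_subclass` is a hypothetical helper):

```python
# Stand-in for the dictionary of geographical names (an assumption).
GEO_NAMES = {"Egypt", "France", "Japan"}


def geographic_subclass(parent_formula, caption):
    """Return the parent formula plus the place if the caption is a known
    geographical name; return None so the class is routed to an editor."""
    name = caption.strip()
    if name in GEO_NAMES:
        return set(parent_formula) | {name}
    return None


airport_buildings = {"L15", "L33", "P23", "T77"}          # NA6300-6307
egypt = geographic_subclass(airport_buildings, "Egypt")   # NA6305.E3
```

A caption that is not in the dictionary yields `None`, so only unrecognized subdivisions would need human checking.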
16
Examples from the resulting knowledge base
17
HE550-560 Ports, harbors, docks, wharves, etc. = L15 Traffic stations ⊓ L37 Water transport ⊓ T77 Civilian
NA2800 Architectural acoustics = P27 Architecture ⊓ R30 Acoustics
NA6300-6307 Airport buildings = L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian
NA6330 Dock buildings, ferry houses, etc. = L15 Traffic stations ⊓ L37 Water transport ⊓ P23 Buildings ⊓ T77 Civilian
TC350-374 Harbor works = L15 Traffic stations ⊓ L37 Water transport ⊓ R00 Engineering ⊓ T77 Civilian
TH1725 Soundproof construction = P23 Buildings ⊓ P43 Construction ⊓ R37 Soundproofing
TL681.S6 Airplanes. Soundproofing = L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
TL725-726 Airways (Routes). Airports and landing fields. Aerodromes = L13 Traffic facilities ⊓ L33 Air transport ⊓ Technical aspects
VA67-79 Naval ports, bases, reservations, docks = L15 Traffic stations ⊓ L37 Water transport ⊓ T73 Military
VM367.S6 Submarines. Soundproofing = L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ T73 Military ⊓ Underwater
18
LC subject headings with combinations of atomic concepts
Aeroplanes-Soundproofing = L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
Airports-Buildings = P23 Buildings ⊓ L15 Traffic stations ⊓ L33 Air transport
Buildings-Soundproofing = P23 Buildings ⊓ P43 Construction ⊓ R37 Soundproofing
Ships-Soundproofing = L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing
19
Hub
L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing
L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ T73 Military ⊓ Underwater
LCC
TL681.S6 Airplanes. Soundproofing
VM367.S6 Submarines. Soundproofing
LCSH
Aeroplanes-Soundproofing
Ships-Soundproofing
Mapping through a Hub
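Once LCC classes and LCSH headings share hub formulas, a mapping can be read off by comparing the formulas. The sketch below assumes purely conjunctive formulas stored as sets and treats the atomic concepts as unrelated to each other (it ignores the hierarchy among atomic concepts, which a DL reasoner would take into account); `map_term` is a hypothetical helper name.

```python
LCC = {
    "TL681.S6 Airplanes. Soundproofing":
        frozenset({"L17", "L33", "R37"}),
    "VM367.S6 Submarines. Soundproofing":
        frozenset({"L17", "L37", "R37", "T73", "Underwater"}),
}
LCSH = {
    "Aeroplanes-Soundproofing": frozenset({"L17", "L33", "R37"}),
    "Ships-Soundproofing": frozenset({"L17", "L37", "R37"}),
}


def map_term(formula, target):
    """Map one hub formula into a target KOS.

    Equal formulas give an exact match. Otherwise a target whose conjuncts
    are a proper subset of the source's conjuncts is broader; only the most
    specific broader targets are returned."""
    exact = [t for t, f in target.items() if f == formula]
    if exact:
        return "exact", exact
    broader = {t: f for t, f in target.items() if f < formula}
    best = [t for t, f in broader.items()
            if not any(f < g for g in broader.values())]
    return ("broader", best) if best else ("none", [])


airplanes = map_term(LCC["TL681.S6 Airplanes. Soundproofing"], LCSH)
submarines = map_term(LCC["VM367.S6 Submarines. Soundproofing"], LCSH)
```

With the formulas on the slide, TL681.S6 maps exactly to Aeroplanes-Soundproofing, while VM367.S6 has no exact counterpart and falls back to the broader heading Ships-Soundproofing.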
20
Mapping user queries
User query
• Free text
• Combination of elemental concepts through facets (guided query formulation)
• Controlled term(s) from a KOS, possibly found through browsing a KOS
Hub
• Canonical form of query (DL formula)
Final query
• (Enriched) free text query
• Query in terms of a KOS
21
TL681.S6 Airplanes. Soundproofing [L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing]
VM367.S6 Submarines. Soundproofing [L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ Military]
Aeroplanes-Soundproofing [L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing]
Ships-Soundproofing [L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing]
Query: L17 Vehicles AND R37 Soundproofing
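The facet query on this slide retrieves every class or heading whose formula contains all queried atomic concepts, regardless of which KOS it comes from. A minimal sketch, assuming set-based formulas; the TH1725 entry is taken from the earlier knowledge-base examples and is included only as a non-matching case.

```python
# Hub formulas for terms from two KOS (LCC classes and LCSH headings).
INDEX = {
    "TL681.S6 Airplanes. Soundproofing": {"L17", "L33", "R37"},
    "VM367.S6 Submarines. Soundproofing": {"L17", "L37", "R37", "Military"},
    "Aeroplanes-Soundproofing": {"L17", "L33", "R37"},
    "Ships-Soundproofing": {"L17", "L37", "R37"},
    "TH1725 Soundproof construction": {"P23", "P43", "R37"},
}


def search(query):
    """Return all terms whose formula contains every queried concept."""
    wanted = set(query)
    return sorted(term for term, formula in INDEX.items()
                  if wanted <= formula)


hits = search({"L17", "R37"})  # Query: L17 Vehicles AND R37 Soundproofing
```

The query finds all four vehicle soundproofing terms and leaves out TH1725, which concerns soundproofing of buildings.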
22
Examples from NALT and LCSH
• NALT National Agricultural Library Thesaurus
• LCSH Library of Congress Subject Headings
23
Air pollution laws
LCSH term
Air – Pollution – Laws and regulations
[isa] Legal rule [appliedTo] {[isa] Condition [isConditionOf] Air [causedBy] Pollutant [property] Undesirable}
NALT terms
Air pollution
[isa] Condition [isConditionOf] Air [causedBy] Pollutant [property] Undesirable
Laws and regulations
[isa] Legal rule
Mapping LCSH ▬► NALT
Air – Pollution – Laws and regulations ▬► Air pollution AND Laws and regulations
Interpretation for indexing and searching in both directions
24
Soil moisture vs. Soil water
LCSH term
Soil moisture
[isa] Water [containedIn] Soil
NALT term
Soil water
[isa] Water [containedIn] Soil
Mapping LCSH ▬► NALT
Soil moisture ▬► Soil water
25
Greenhouse gardening
LCSH term
Greenhouse gardening
[isa] Gardening [inEnvironment] Greenhouse [inEnvironment] Home
NALT terms
Home gardening
[isa] Gardening [inEnvironment] Home
Greenhouse
[isa] Greenhouse
Mapping LCSH ▬► NALT
Greenhouse gardening ▬► Home gardening AND Greenhouse
26
Salad greens
LCSH term
Salad greens
[isa] Green leafy vegetable [usedFor] Salad
NALT term
Green leafy vegetables
[isa] Green leafy vegetable
Mapping LCSH ▬► NALT
Salad greens ▬► BT Green leafy vegetables
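The Soil moisture and Salad greens examples differ only in whether the two formulas are identical or one adds a restriction. A sketch of that distinction, assuming each formula is a flat set of (relationship, concept) pairs; nested formulas such as the one for air pollution laws and AND-combinations of several target terms are not handled here, and `map_to_nalt` is a hypothetical name.

```python
LCSH = {
    "Soil moisture": frozenset({("isa", "Water"), ("containedIn", "Soil")}),
    "Salad greens": frozenset({("isa", "Green leafy vegetable"),
                               ("usedFor", "Salad")}),
}
NALT = {
    "Soil water": frozenset({("isa", "Water"), ("containedIn", "Soil")}),
    "Green leafy vegetables": frozenset({("isa", "Green leafy vegetable")}),
}


def map_to_nalt(lcsh_term):
    """Return the NALT term with an identical formula, else a broader term
    (BT) whose formula is a proper subset of the LCSH formula, else None."""
    source = LCSH[lcsh_term]
    for term, formula in NALT.items():
        if formula == source:
            return term
    for term, formula in NALT.items():
        if formula < source:
            return "BT " + term
    return None


soil = map_to_nalt("Soil moisture")
salad = map_to_nalt("Salad greens")
```

Soil moisture maps to Soil water because the formulas coincide; Salad greens maps to BT Green leafy vegetables because the NALT formula lacks the [usedFor] Salad restriction.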
27
Emerging diseases
LCSH term
Emerging infectious diseases
[isa] Disease [hasProperty] Infectious [hasProperty] Emerging
NALT term
Emerging diseases
[isa] Disease [hasProperty] Infectious ??? [hasProperty] Emerging
Mapping LCSH ▬► NALT
Emerging infectious diseases ▬► Emerging diseases
Emerging infectious diseases ▬► BT Emerging diseases
28
Distributed implementation
• A KOS on the Web could assign DL formulas to its concepts − let's call this a semantically enhanced KOS or SEKOS
• Could use any of a number of faceted core classifications or even several (using a unique URI for each elemental concept)
• Core classifications could be mapped to each other
• It is now a simple matter to map from any SEKOS to any other (somewhat dependent on the core classifications used)
29
Take-home message
Semantics gives powerful systems
Semantik schafft maechtige Systeme
31
This project will achieve the following
• Interoperability between any two participating Knowledge Organization Systems (KOS) (to the extent the two schemes allow)
• Facet-based search
  • for any collection indexed by a participating KOS
  • for free-text search
• Assistance in cataloging (metadata creation) by catalogers or users (social tagging)
• Long-range goal: Web service where a KOS can be uploaded and mappings to specified target KOS are returned
Means
• Create a comprehensive knowledge base relating many classification schemes and subject heading lists used in libraries and in other contexts (LCC, DDC, DMOZ directory, LCSH, European schemes).
• Use combinations of atomic concepts taken from a well-structured underlying faceted classification to represent the meaning of classes and subject headings.
34
Mapping through a Hub
Hub
LCC
LCSH
Koeln 20090706
• Themes
• Role indicators for building themes
• Arrangement of themes for exploration under user control
• Carry-over from citation order
• Practical problem of connection to the participating systems: should use IDs for combinations in the Hub. Make sure that the hub stays consistent with participating systems.