Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS)
Marcia Lei Zeng
Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007
M.L.Zeng @ ISSAI, Helsinki,2007 2
Purpose of this talk
• Introduce different types of knowledge organization systems/structures/services (KOS)
• Provide a common terminology and background
1. KOS overview (1)
Knowledge organization systems/structures/services (KOS) encompass all types of schemes for organizing information and promoting knowledge management.
– (Gail Hodge, 2000)
1. KOS overview (2)
These systems
• model the underlying semantic structure of a domain, and
• provide semantics, navigation, and translation through labels, definitions, typing, relationships, and properties for concepts.
– (Hill et al. 2002; Koch and Tudhope 2004)
A Taxonomy of KOS
(Figure: KOS types arranged by Function and Structure.)
• Term Lists: Pick lists, Glossaries/Dictionaries, Synonym Rings
• Metadata-like Models: Authority Files, Directories, Gazetteers
• Classification & Categorization: Subject Headings, Categorization schemes, Taxonomies, Classification schemes
• Relationship Models: Thesauri, Semantic networks, Ontologies
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.1 Eliminating ambiguity
• Ambiguity: terms having the same spelling (homographs) that represent different concepts or meanings
• Ambiguity exists when a given term can be used to represent completely different concepts.
Ambiguity / Homographs
Source: Z39.19-2005, p.25
To eliminate ambiguity (1)
1. Adding a qualifier to a term
-- one of the major methods used by almost every type of KOS, especially lists of subject headings and thesauri.
• e.g., Mercury (automobile)
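A minimal sketch of how qualifiers can separate homographs in a vocabulary lookup. Only "Mercury (automobile)" comes from the slide; the other qualified labels and the helper name `candidates` are illustrative assumptions, not part of any standard.

```python
# Hypothetical mini-vocabulary: each bare homograph maps to its qualified terms.
# Only "Mercury (automobile)" appears in the slides; the rest are assumed examples.
QUALIFIED_TERMS = {
    "mercury": [
        "Mercury (automobile)",
        "Mercury (planet)",
        "Mercury (metal)",
    ],
}


def candidates(term: str) -> list[str]:
    """Return the qualified terms a user must choose between for a bare term."""
    # Terms that are not homographs in this vocabulary are returned unchanged.
    return QUALIFIED_TERMS.get(term.strip().lower(), [term])


print(candidates("Mercury"))
```

A term with no recorded homographs, such as `candidates("Wine")`, simply comes back as a one-item list, so callers can treat both cases the same way.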
To eliminate ambiguity (2)
2. Providing a scope note
-- another major method used by almost every type of KOS, especially lists of subject headings, classifications, and thesauri.
Screenshot from MeSH, http://www.nlm.nih.gov/mesh/MBrowser.html
Entry: mercury
To eliminate ambiguity (3)
3. Providing a context for a term
What are these?
• Flying Horse
• King Fisher
• Royal Challenge
• Heineken
• Budweiser
• Miller-Lite
• Bud-Light
Drinks
• Flying Horse
• King Fisher
• Royal Challenge
• Taj Mahal
• Hayward’s 2000
• Heineken
• Corona
• Budweiser
• Miller-Lite
• Bud-Light
Lists (Pick lists)
A type of controlled vocabulary included in the NISO Z39.19 Standard
• Lists are used to describe aspects of content objects or entities that have a limited number of possibilities.
• Examples include:
– geography (e.g., country, state, city),
– language (e.g., English, French, Swedish),
– format (e.g., text, image, sound), or
– … …
Lists can be used effectively for both browsing and searching.
• In browsing, items are directly accessed when the list of terms is reviewed and one term is selected.
Source: http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml
• In searching, a list may be used to access content in a single term search, or the terms from the list may be used to limit a retrieved set by another attribute of interest for the user (one or more terms in the search).
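As a rough sketch of the searching case, the snippet below limits a keyword result set by one value from a format pick list. The pick-list values (text, image, sound) are from the slides; the records, field names, and the `search` function are hypothetical.

```python
# Pick list of allowed "format" values (values taken from the slides).
FORMAT_PICK_LIST = ("text", "image", "sound")

# Hypothetical records, assumed for illustration only.
RECORDS = [
    {"title": "Harbour at dawn", "format": "image"},
    {"title": "Harbour history", "format": "text"},
    {"title": "Harbour bells", "format": "sound"},
]


def search(records, keyword, fmt=None):
    """Keyword search, optionally limited by one term from the pick list."""
    if fmt is not None and fmt not in FORMAT_PICK_LIST:
        raise ValueError(f"{fmt!r} is not in the pick list")
    # Step 1: retrieve by keyword (case-insensitive match on the title).
    hits = [r for r in records if keyword.lower() in r["title"].lower()]
    # Step 2: limit the retrieved set by the selected pick-list term.
    if fmt is not None:
        hits = [r for r in hits if r["format"] == fmt]
    return hits


print([r["title"] for r in search(RECORDS, "harbour", fmt="image")])
```

Because the limiter must come from the closed list, a value outside it is rejected up front instead of silently returning nothing.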
Source: Google’s advanced search http://www.google.com
pick lists
Waterford County Image Archive, http://www.waterfordcountyimages.org
List - Definition, Purpose, and Uses
• A list (also called a pick list) is a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.
– A list is a series of terms in some sequential order.
– Terms can be ordered alphabetically, chronologically, numerically, etc.
Exercise: Which list is better?
• The defining characteristics of a list are that the terms:
· are all members of the same set or class of items (e.g., countries, products)
· are not overlapping in meaning
· are equal in terms of specificity (granularity)
Typical applications
• Lists are frequently used to display small sets of terms that are to be used for quite narrowly defined purposes such as a web pull-down list or list of menu choices.
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.2 Controlling synonyms or equivalents
• Synonyms: terms with the same or similar meanings
1. True synonyms (unusual)
– mean exactly the same thing and are used in precisely the same context
2. Near synonyms (most common)
1. True Synonyms
• common and technical names
– salt vs. sodium chloride
• changes in usage of terms over time
– electronic calculating machines vs. computers
• in different languages
– eyeglasses, spectacles, glasses
• acronyms
– BBC, British Broadcasting Corporation; MPG, miles per gallon
• variant spellings:
– cancelled, canceled; honor, honour
2. Near Synonyms
• Same stem
– computing, computers, computed, microcomputers, supercomputers
• Overlapping concepts
– medicine, drugs
– fired, laid off
– forest, woods
– arid, dry
• General and specific terms
Coffee
– Double Espresso
– Latte
– Cappuccino
– Short Black
– Macchiato
– Flat White
– etc.
Synonymy
Source: Z39.19-2005, p.25
• Each distinct concept should refer to a unique linguistic form.
• Information or content that is provided to a user should not spread across the system under multiple access points, but should be gathered together in one place.
Controlling synonyms: there will only be one term used to represent a given concept or entity.

Authority File
… …
150 World War, 1939-1945
450 European War, 1939-1945
450 Second World War, 1939-1945
450 World War 2, 1939-1945
450 World War II, 1939-1945
450 World War Two, 1939-1945
Source: FAST: Faceted Application of Subject Terminology, http://fast.oclc.org/

or:

Thesaurus
World War, 1939-1945
UF European War, 1939-1945
UF Second World War, 1939-1945
UF World War 2, 1939-1945
UF World War II, 1939-1945
UF World War Two, 1939-1945

European War, 1939-1945
USE World War, 1939-1945

Second World War, 1939-1945
USE World War, 1939-1945

World War 2, 1939-1945
USE World War, 1939-1945

World War II, 1939-1945
USE World War, 1939-1945

World War Two, 1939-1945
USE World War, 1939-1945
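The USE / UF pairs above can be sketched as a simple lookup from entry terms to the one preferred term. The term strings are taken from the slide; the function names `preferred_term` and `used_for` are hypothetical, and a real authority file carries far more data than this.

```python
PREFERRED = "World War, 1939-1945"

# Entry terms (UF / 450) pointing to the single preferred term (USE / 150).
USE = {
    "European War, 1939-1945": PREFERRED,
    "Second World War, 1939-1945": PREFERRED,
    "World War 2, 1939-1945": PREFERRED,
    "World War II, 1939-1945": PREFERRED,
    "World War Two, 1939-1945": PREFERRED,
}


def preferred_term(term: str) -> str:
    """Follow a USE reference; a preferred term resolves to itself."""
    return USE.get(term, term)


def used_for(term: str) -> list[str]:
    """Derive the reciprocal UF list for a preferred term."""
    return sorted(entry for entry, target in USE.items() if target == term)


print(preferred_term("World War II, 1939-1945"))
```

Storing only the USE direction and deriving UF from it keeps the two sides of each reciprocal pair from drifting apart.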
Source: Art and ArchitectureThesaurus (AAT)
Source: Medical Subject Headings (MeSH)
Synonym Rings
A type of controlled vocabulary included in the NISO Z39.19 Standard
astronaut
spaceman cosmonaut
spationaut taikonaut
A synonym ring connects a set of words that are defined as equivalent for retrieval.
An example from International SEMATECH.
A search for Silicon would look like this:
Your search was submitted as “CILICON” or “SI”
Synonym Rings are used--
• to expand queries for content objects
– If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster.
• in systems where the underlying content objects are left in their unstructured natural language format
– The control is achieved through the interface by drawing together similar terms into these clusters.
• in conjunction with search engines
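Query expansion with a synonym ring might look roughly like the sketch below. The ring members come from the astronaut example in the slides; the documents and the `expand` / `search` helpers are assumptions for illustration, and the matching is plain substring matching, not what a production search engine would do.

```python
# Each ring is a set of terms treated as equivalent for retrieval.
RINGS = [
    {"astronaut", "spaceman", "cosmonaut", "spationaut", "taikonaut"},
]

# Hypothetical unstructured documents, left in natural language.
DOCS = {
    1: "The cosmonaut returned after six months in orbit.",
    2: "A taikonaut conducted the first spacewalk of the mission.",
    3: "The pilot landed the aircraft safely.",
}


def expand(query: str) -> set[str]:
    """Return every term in the ring containing the query, or just the query."""
    q = query.lower()
    for ring in RINGS:
        if q in ring:
            return set(ring)
    return {q}


def search(query: str) -> list[int]:
    """Retrieve ids of documents containing any term of the expanded query."""
    terms = expand(query)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if any(term in text.lower() for term in terms)
    )


print(search("astronaut"))
```

Note that the documents themselves are never re-indexed: all of the control sits in the query interface, which is the point the slide makes about unstructured content.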
Poverty mitigation
Poverty alleviation
Poverty elimination
Poverty reducation
Poverty eradication
Poverty abatement
Poverty prevention
Poverty reduction
Rings can include all kinds of synonyms - true, misspellings, predecessors, abbreviations
Source: Bedford, 2006 ppt.
Exercise
• Find synonyms of this type of object:
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.3 Making explicit semantic relationships – Hierarchical relationships
Birds
. Cardinals
. Doves
. Robins
. Wrens
All specific names of birds are kinds of birds.
Scientific Taxonomy
An example: Leatherback turtle
Phylum: Chordata
Class: Reptilia
Subclass: Anapsida
Order: Testudines
Suborder: Cryptodira
Family: Dermochelyidae
Genus: Dermochelys
Species: Dermochelys coriacea (Leatherback turtle)
Classifications
superordinate classes (e.g., parents)
. coordinate classes (e.g., siblings)
. . subordinate classes (e.g., children)
. . subordinate classes
. coordinate classes
. coordinate classes
. . subordinate classes
relationship types: generic, instance, and whole-part
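A small sketch of the superordinate / coordinate / subordinate vocabulary, using the Birds example from the slides. The dictionary layout and the helper names `broader`, `siblings`, and `descendants` are assumptions; the sketch also assumes each term has at most one parent.

```python
# Narrower-term links, taken from the "Birds" example in the slides.
NARROWER = {
    "Birds": ["Cardinals", "Doves", "Robins", "Wrens"],
}


def broader(term):
    """Return the superordinate (parent) class of a term, or None at the top."""
    for parent, children in NARROWER.items():
        if term in children:
            return parent
    return None


def siblings(term):
    """Return the coordinate classes: the other children of the same parent."""
    parent = broader(term)
    if parent is None:
        return []
    return [t for t in NARROWER[parent] if t != term]


def descendants(term):
    """Return all subordinate classes below a term, depth first."""
    found = []
    for child in NARROWER.get(term, []):
        found.append(child)
        found.extend(descendants(child))
    return found


print(broader("Doves"), siblings("Doves"))
```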
2.3 Making explicit semantic relationships – Associative relationships (not hierarchical)
Relationship | Example
Part / Whole | Bicycle / Bicycle Wheel
Cause / Effect | Accident / Injury
Process / Agent | Velocity measurement / Speedometer
Action / Product | Writing / Publication
Action / Patient | Teaching / Student
Concept or Thing / Properties | Steel alloy / Corrosion resistance
Concept or Thing / Origins | Water / Well
Thing or Action / Counter-agent | Pest / Pesticide
Raw material / Product | Grapes / Wine
Action / Property | Communication / Communication skills
Antonyms | Single people / Married people
Source: Z39.19-2005, p.29
KOS in Use at World Bank
• Topic Thesaurus (500,000+ English terms, French and Spanish language versions in progress now)
• Topic Classification Scheme (30 top classes, 700+ subtopics, 300+ subsubtopics)
• Business Function Thesaurus (50,000 terms and growing)
• Business Function Classification Scheme (5 business areas, 30 lines of business, 300+ business processes)
• Country-Region classification scheme (6 regions, ca. 200 countries)
• Content Type Classification Scheme (8 content types, 300+ secondary content types – in refinement now)
• Media-Format Classification Scheme
• Country Name Authority Control (synonym, predecessor, successor sources)
• Edition Statements Authority Control
• Publisher Name Authority Control
• Organization Authority Control
• Language Authority Control
• Series Name/Collection Title Authority Control
• Translation Type Authority Control
Source: Bedford, 2007, ASIST
Vision of An Enterprise Advanced Search
(Figure labels: Pick lists, Hierarchical taxonomy, Synonym Rings)
Source: Revised based on Bedford, 2006 ppt.
(Figure labels: Synonym Rings, Thesaurus, Metadata)
Source: Revised based on Bedford, 2006 ppt.
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.4 Presenting relationships as well as properties of concepts
• Entity types
• Relationship types
• Properties
Semantic networks
organize sets of terms representing concepts, modeled as the nodes in a network of variable relationship types.
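One way to picture this is as a set of typed edges between concept nodes. In the sketch below the concept pairs are borrowed from the associative-relationship examples in this deck, while the relationship type names (`causes`, `part_of`, and so on) and the `related` helper are assumed labels, not drawn from any particular semantic network.

```python
from collections import defaultdict

# (source concept, relationship type, target concept)
# Pairs come from the deck's examples; the type names are assumed labels.
EDGES = [
    ("Accident", "causes", "Injury"),
    ("Grapes", "raw_material_of", "Wine"),
    ("Pesticide", "counteracts", "Pest"),
    ("Bicycle Wheel", "part_of", "Bicycle"),
]

# Adjacency list: each node keeps its outgoing (relation, target) pairs.
graph = defaultdict(list)
for source, relation, target in EDGES:
    graph[source].append((relation, target))


def related(concept, relation=None):
    """Return targets linked from a concept, optionally filtered by relation type."""
    return [t for r, t in graph.get(concept, []) if relation is None or r == relation]


print(related("Accident", "causes"))
```

Keeping the relationship type on every edge is what distinguishes this from a plain "related term" link in a thesaurus.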
UMLS Semantic Network
135 Semantic Types (link) and 54 Semantic Relation Types (link)
Source: Noy, N. F. and Tu, S.W. (2003).
Ontologies
Classes
attributes
instances
The Graph view of relations
A Taxonomy of KOS © 2007 Zeng
(Figure: KOS types plotted by structure against their major functions.)
Structure: Flat structure, Two-dimensions, Multiple dimensions
Major functions: eliminating ambiguity; controlling synonyms; establishing relationships: hierarchical; establishing relationships: associative; presenting properties
• Term Lists: Pick lists, Glossaries/Dictionaries, Synonym Rings
• Metadata-like Models: Authority Files, Directories, Gazetteers
• Classification & Categorization: Subject Headings, Categorization schemes, Taxonomies, Classification schemes
• Relationship Models: Thesauri, Semantic networks, Ontologies
Networked KOS → NKOS
• KOS are not used in isolation;
• KOS may be used, re-used, and re-purposed in web-based services;
• KOS are used for:
– organizing, indexing, cataloging, and searching, AND
– learning, knowledge modeling, reasoning, etc.
• NKOS need to be machine-processable, machine-understandable
– (more to discuss later today)
References
• Hodge, Gail (2000). Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington, DC: Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub91/contents.html http://www.clir.org/pubs/reports/pub91/pub91.pdf
• Hill, Linda, Buchel, Olha, Janee, Greg, and Zeng, Marcia L. (2002). Integration of knowledge organization systems into digital library architectures. In: Mai, Jens-Erik, et al. ed.: Advances of classification research, volume 13, proceedings of the 13th ASIST SIG/CR Workshop, 17 November 2002, Philadelphia PA, pp. 62-68.
• Koch, Traugott and Tudhope, Douglas (2004). User-centred approaches to Networked Knowledge Organization Systems/Services (NKOS): Background. http://www2.db.dk/nkos-workshop/#Background