Module 8a: Faceted Classification
IMT530: Organization of Information Resources
Winter 2008
Michael Crandall
IMT530 Organization of Information Resources
Overview
• Enumerated vs. analytico-synthetic techniques
• History of faceted classification
  – Ranganathan & CRG
• Facets
• Examples
• Facet analysis
• Spiteri’s simplified model
• Process
• Use of facets in information systems
Enumerated Classification vs. Faceted Classification
• Enumerated classification
  – One-dimensional
    • Dewey Decimal Classification System
    • Library of Congress Classification System
  – Rigid hierarchical approach
  – Gives a single schedule that enumerates fully the classes and their ready-made class numbers
Enumerated Classification vs. Faceted Classification
• Analytico-Synthetic (Faceted Classification)
  – Multi-dimensional
  – Clearly defined, mutually exclusive, and collectively exhaustive aspects, properties, or characteristics of a class or specific subject
Faceted Classification
• FA (Facet Analysis) - (analytical technique)
  • Listing of characteristics of the entities in a universe (exhaustive, mutually exclusive)
• FC (Facet Classification) - (synthetic structure)
  • Structure – division of entities in a universe (by one characteristic at a time)
  • Synthesis – combination of relevant facets:
    – Schedule of terms for description
    – Assignment of notation
Origin of Faceted Classification
Ranganathan (1892 - 1972)
– A Hindu mathematician
– Worked as a librarian
– Started from the limits of traditional enumerative classification systems
– Attempted to describe the entire universe of ideas
– 1930s
– Colon Classification – Analytico-synthetic classification system (1933)
Basic Ranganathan
• Analyze each document
• Group isolates (simple-concept subjects) into the facets
• Order the isolates within the facets
• Establish a citation order for facets (Ranganathan’s is P-M-E-S-T)
• Establish a schedule order for the facets
• Apply the notational system
• Compile schedules & generate an index
Ranganathan’s 5 Facets
• Personality: Who
• Matter: What
• Energy: How
• Space: Where
• Time: When
Ranganathan’s Rule
• 46 Canons : Must follow
• 13 Postulates : Strongly recommended
• 22 Principles : Strongly recommended
Examples
• Ranganathan's Colon Classification
  – research in the cure of tuberculosis of lungs by x-ray conducted in India in 1950
  – L,45;421:6;253:f.44'N5
  – Medicine,Lungs;Tuberculosis:Treatment;X-ray:Research.India'1950
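The way the class number above is assembled from its parts can be sketched in Python. This is only a string-concatenation illustration that reproduces the slide's example; the connecting symbols and notations are copied from that example, the names `PARTS` and `synthesize` are hypothetical, and the sketch does not implement the Colon Classification rules for choosing isolates or connectors.

```python
# (connector, notation, label) triples in citation order, copied from the
# slide's example. The leading class has no connector before it.
PARTS = [
    ("", "L", "Medicine"),
    (",", "45", "Lungs"),
    (";", "421", "Tuberculosis"),
    (":", "6", "Treatment"),
    (";", "253", "X-ray"),
    (":", "f", "Research"),
    (".", "44", "India"),
    ("'", "N5", "1950"),
]

def synthesize(parts, use_labels=False):
    """Concatenate the components, each preceded by its connecting symbol."""
    return "".join(
        connector + (label if use_labels else notation)
        for connector, notation, label in parts
    )

print(synthesize(PARTS))                   # the notation string
print(synthesize(PARTS, use_labels=True))  # the same structure with verbal labels
```

Because the notation and the labels share one list, the two output strings always have the same structure, which is what the paired lines in the example show.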
Classification Research Group (CRG)
• UK (1952)
• Produced classification systems for narrower, specialized areas
• Designed several subject-specific faceted classification systems
Ranganathan & the CRG
• Agree about the essential qualities of a facet
  – Mutually exclusive; each facet represents a characteristic not found in any other facet
  – Relationships between facets are non-hierarchical
Facets
• The broad categories into which the subject area is divided. A facet consists “... of a group of terms that represents one, and only one, characteristic of division of a subject field. ... no two facets may contain terms that could represent the same concepts.”
  Spiteri, L. (1998). A Simplified Model for Facet Analysis. Canadian Journal of Information and Library Science v23, 1-30 (April-July 1998). http://iainstitute.org/pg/a_simplified_model_for_facet_analysis.php
• “Clearly defined, mutually exclusive, and collectively exhaustive aspects, properties or characteristics of a class or specific subject”
  Maple, A. (1995). Faceted Access: A Review of the Literature. http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html
Examples of Facets
• Petersen (1994) – the Art & Architecture Thesaurus
  – Associated Concepts / Physical Attributes / Styles and Periods / Agents / Activities / Materials / Objects
• Business
  – Products / Applications / Organizations / People / Domain objects / Events / Publications
Epicurious.com
• Recipe collection on the web
  – Cuisine
  – Special considerations
  – Meal/Course
  – Dish
  – Main Ingredients
  – Preparation methods
  – Season/Occasion
Examples
• Faceted Classification
  – Epicurious http://www.epicurious.com/recipes/find/advanced/
  – Wine.com http://www.wine.com/wineshop/
  – Flamenco http://flamenco.berkeley.edu/demos.html
  – Images of England http://www.imagesofengland.org.uk
  – lawforwa.org http://www.lawforwa.org/search/advsearch.html
  – FAT-HUM Project http://www.ucl.ac.uk/fatks/php/browse.php
• Tools
  – FacetMap’s Wine demonstration http://facetmap.com/download/starterKit.jsp
  – Siderean software http://www.siderean.com/
  – Endeca software http://endeca.com/
Characteristics of a Faceted Classification System
• Based on the important, essential or persistent characteristics of content objects
• More than hierarchies
• Easy to extend by adding a new facet
• Flexibility
• Easier to construct
• Easy to formulate composite subjects
• Easy to accommodate new concepts
• Provides multiple access points to content
Facet Analysis
• “Facet analysis is a mental process involving analysis of a subject into its facets based on a set of postulates, canons and principles. It provides a framework to accommodate various types of terms, along with rules for their combination.” (K. Kumar)
• Facet analysis is the sorting of terms in a given field of knowledge into “homogenous, mutually exclusive facets, each derived from the parent universe by a single characteristic of division ... every distinctive logical category should be isolated, every new characteristic of division should be clearly indicated.” (B. C. Vickery)
Planes of Work
• The Idea Plane – the process of analyzing a subject field into its component parts
• The Verbal Plane – the process of choosing appropriate terminology
• The Notational Plane – the process of expressing these component parts by means of a notational device
Idea Plane: Principles for the Choice of Facets
• Differentiation
  – Divide by a clearly defined characteristic of division
  – E.g., Humans by Gender
• Relevance
  – Reflect the purpose and scope of the classification system
  – E.g., Children by Grade, but not for Dogs
• Ascertainability
  – Definite and ascertainable facts
  – E.g., Date of birth for Humans, Breed for Dogs
• Permanence
  – Permanent qualities of the entity
  – E.g., Color wouldn’t work for chameleons
Idea Plane: Principles for the Choice of Facets
• Mutual Exclusivity
  – Facets must not overlap
  – E.g., geography and product names
• Homogeneity
  – Each facet represents only one characteristic
• Fundamental Categories
  – Categories should be derived from the domain
  – Disagrees with Ranganathan’s universal PMEST
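The requirement that facets not overlap lends itself to a mechanical check: no term should appear in more than one facet. The following is a minimal Python sketch under that reading. The facet names echo the recipe example earlier in the deck, but the terms and the function name `overlapping_terms` are hypothetical.

```python
from itertools import combinations

# Hypothetical facets for a recipe collection; the terms are illustrative only.
FACETS = {
    "Cuisine": {"Italian", "Thai", "Mexican"},
    "Meal/Course": {"Breakfast", "Dessert", "Appetizer"},
    "Main Ingredients": {"Rice", "Chicken", "Thai"},  # "Thai" is deliberately misplaced
}

def overlapping_terms(facets):
    """Return {(facet_a, facet_b): shared_terms} for every pair of facets that overlap."""
    overlaps = {}
    for (name_a, terms_a), (name_b, terms_b) in combinations(facets.items(), 2):
        shared = terms_a & terms_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps

print(overlapping_terms(FACETS))
```

An empty result means no term is shared between facets. A shared term, such as the misplaced "Thai" here, suggests that one facet is mixing in a characteristic that belongs to another.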
Idea Plane: Principles for the Citation Order of Facets and Foci
• Relevant Succession
  – Chronological Order
  – Alphabetical Order
  – Spatial/Geometric Order
  – Simple to Complex Order (or Complex to Simple)
  – Canonical Order
  – Increasing Quantity (or Decreasing Quantity)
• Consistent Succession
Principles for the Verbal Plane
• Context
  – Meaning is determined by position in the system
      Grain dishes
        Rice dishes
          White rice dishes
            With raisins
          Brown rice dishes
• Currency
  – Should use terminology appropriate for the language in use at time of indexing
Guidelines for Faceted Classification
• Study the domain
  – (Context) Examine the domain
  – (Content) Study information objects
  – (Users) Who? Information Needs?
• Entity listing
• Facet creation
• Facet arrangement
• Citation order
• Classification
• Revision, testing, and maintenance
Use of the Facet Approach
• Traditional Use
  – Classification
  – Thesaurus
  – Indexing
• Information Systems
  – Information Architecture & User-Centered Design
    • Navigation and browse
  – Information Retrieval
    • Individual facets can be accessed and retrieved either alone or in any desired combination
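Retrieval by a single facet or by any combination of facets can be sketched in Python as follows. The recipe records, facet values, and function names are hypothetical; this is a minimal illustration, not the implementation of any system named in this deck.

```python
from collections import Counter

# Hypothetical recipes tagged with facet values.
RECIPES = [
    {"title": "Pad Thai", "Cuisine": "Thai", "Meal/Course": "Dinner"},
    {"title": "Mango Sticky Rice", "Cuisine": "Thai", "Meal/Course": "Dessert"},
    {"title": "Tiramisu", "Cuisine": "Italian", "Meal/Course": "Dessert"},
]

def retrieve(items, selected):
    """Return titles of items that match every selected facet value."""
    return [
        item["title"]
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]

def facet_counts(items, facet):
    """Count items per value of one facet, e.g. to label navigation links."""
    return Counter(item[facet] for item in items if facet in item)

print(retrieve(RECIPES, {"Cuisine": "Thai"}))                            # one facet alone
print(retrieve(RECIPES, {"Cuisine": "Thai", "Meal/Course": "Dessert"}))  # two facets combined
print(facet_counts(RECIPES, "Meal/Course"))
```

Because each facet is stored independently, adding a further constraint only narrows the result set; no pre-built hierarchy of combined classes is needed.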
Faceted Approach in IS
• Simple in structure
• Flexible in application
• Amenable to software applications
• Amenable to computer assisted indexing and validation
• Interoperable with the majority of modern indexing vocabularies
• Easier and more economical to maintain than enumerated vocabularies
Faceted Navigation
Facet-based Advanced Search
Combined
Usability Testing of Faceted Approaches
• Flamenco developed by Marti Hearst and others to test facets for images
• Built an interface to support both direct search and browsing
• Supports search usability guidelines
• Nine facets
• Opening – Middle Game – Endgame
• http://bailando.sims.berkeley.edu/flamenco.html
Usability Studies – Flamenco (Hearst et al., 2003)
• Usability Study for Search
• 32 art history students
• Search by Faceted Metadata vs Baseline
• More Successful
• More usage time
• 90% - Preferred the metadata approach overall
• 97% - Helped users learn more about the collection
• 75% - More flexible
• 72% - Easier to use
Problems of Faceted Approach
• Mismatched labeling
• Inconsistent category metadata
• Difficulty in deciding on the correct or appropriate facet
• Challenges in defining a useful and usable collection of facets
Recap
• Enumerated classification (Hierarchical)
  – One-dimensional
    • Dewey Decimal Classification System
    • Library of Congress Classification System
  – Rigid Hierarchy
  – Gives a single schedule that enumerates fully the classes and their ready-made class numbers.
• Analytico-Synthetic (Faceted Classification)
  – Multi-dimensional
  – Clearly defined, mutually exclusive and collectively exhaustive aspects, properties, or characteristics of a class or specific subject
Recap
• Hierarchical and faceted approaches are not mutually exclusive
  – You can use hierarchies under facets to help with entry vocabulary and cross references
• You may not always be able to apply mutual exclusion and exhaustivity to facets, but you should use these principles to help clarify
  – Spiteri’s Idea Plane is where you do this work
  – Try to apply terms from all facets to each object (webpage) you’re tagging to see what happens
  – If it doesn’t make sense, you probably need to rethink your facets
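The advice to try applying terms from all facets to each object can be turned into a simple report of which objects lack a term for which facets. The sketch below is a minimal Python illustration; the page URLs, facet names, tags, and the function name `missing_facets` are all hypothetical.

```python
# Hypothetical facet names and tagged pages.
FACET_NAMES = ["Cuisine", "Meal/Course", "Main Ingredients"]

PAGES = {
    "/recipes/pad-thai": {
        "Cuisine": "Thai",
        "Meal/Course": "Dinner",
        "Main Ingredients": "Noodles",
    },
    "/about-us": {},
    "/recipes/tiramisu": {"Cuisine": "Italian", "Meal/Course": "Dessert"},
}

def missing_facets(pages, facet_names):
    """Map each page to the facets for which it has no term assigned."""
    report = {}
    for url, tags in pages.items():
        gaps = [facet for facet in facet_names if not tags.get(facet)]
        if gaps:
            report[url] = gaps
    return report

for url, gaps in missing_facets(PAGES, FACET_NAMES).items():
    print(url, "has no term for:", ", ".join(gaps))
```

A page with one gap may just need more tagging. A page that fits none of the facets, like the hypothetical "/about-us" page here, is the kind of result that may mean the facets need rethinking for that content.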