why are taxonomies necessary?

Post on 12-May-2015

DESCRIPTION

Introduces basic information about what taxonomies (controlled vocabularies) are and why they are important for information finding.

TRANSCRIPT

© 2007 by ContextualAnalysis, LLC

Why Are Taxonomies Necessary?

By Fred Leise

ContextualAnalysis, LLC


Taxonomies are sets of terms (controlled vocabularies or CVs) used to tag documents or other content objects.

Taxonomies may also be used as browsing hierarchies or for search enhancement.

What Are Taxonomies?


Taxonomy terms are collected into groups called attributes. Each attribute (or facet) describes one property of your content.


Example:

Attribute: Office Location

Terms:

London

New York City (alternate terms: NYC, Big Apple)

Washington, DC

In this example, “NYC” and “Big Apple” are given as variants for “New York City.”

Variant terms are used to expand search queries. If a user enters “New York City,” the search system expands the query to “New York City OR NYC OR Big Apple.”
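The expansion described here can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the `VARIANTS` table and the `expand_query` function are hypothetical names, and the OR-joined string simply mirrors the example on this slide.

```python
# Hypothetical variant table: preferred term -> alternate terms.
VARIANTS = {
    "New York City": ["NYC", "Big Apple"],
}

def expand_query(query: str) -> str:
    """Return the query OR-ed with any variant terms known for it."""
    terms = [query] + VARIANTS.get(query, [])
    # Quote each term so multi-word terms stay together as phrases.
    return " OR ".join(f'"{t}"' for t in terms)

print(expand_query("New York City"))
print(expand_query("London"))
```

A term with no recorded variants, such as London, passes through unchanged.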


Search query expansion ensures that more relevant information is found, even though it might use terms the searcher hasn’t thought of.


Other typical attributes include:

Author

Creation Date

Audience

Version Number

Subject


There is an international standard for metadata, the Dublin Core Metadata Element Set, consisting of 15 attributes.


Good metadata schemas (collections of attributes) will adhere as closely as possible to the Dublin Core standard.

More information is available at: www.dublincore.org
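One way to see how closely a schema follows Dublin Core is to map each local attribute to one of the 15 elements and list whatever is left over. The sketch below is a rough illustration: the element names are those of the Dublin Core Metadata Element Set, but the `LOCAL_SCHEMA` mapping (for example, Author to creator) and the `split_schema` function are assumptions made for this example.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical local schema: attribute name -> Dublin Core element,
# or None where no element in the 15-element set is an obvious fit.
LOCAL_SCHEMA = {
    "Author": "creator",
    "Creation Date": "date",
    "Subject": "subject",
    "Version Number": None,
}

def split_schema(schema):
    """Separate attributes that map to Dublin Core from local extensions."""
    mapped = {a: e for a, e in schema.items() if e in DUBLIN_CORE}
    local = sorted(a for a, e in schema.items() if e not in DUBLIN_CORE)
    return mapped, local

mapped, local = split_schema(LOCAL_SCHEMA)
print(mapped)  # attributes covered by Dublin Core
print(local)   # attributes that are local extensions
```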


Well-designed taxonomies:

1. Enable users to find relevant information quickly and efficiently (improved retrieval)

2. Lead users to additional relevant information, providing upselling and cross-selling opportunities

3. Assist authors in consistently tagging content


Proper use of taxonomies results in:

Less time wasted searching for information

Fewer failed searches

Fewer abandoned interactions

Increased income

Reduced customer assistance costs


English is rich in words that mean the same or nearly the same thing

feline/cat

car/automobile

travel/journey/excursion/trip

jeans/denims/Levi's/501s

Why Are Taxonomies Important?


Result: scattering of information.

No matter what term you use in a free-text search, you get only part of the relevant information.

The rest is not retrieved because it uses different terms to describe the same concept.


Consider the example of mobile devices.

There are many ways that users can refer to them:

Personal digital assistants

Handheld computers

Blackberries

PDAs


If users don’t know the term you use to label the information they are looking for, they waste time browsing or give up their search completely.

They are victims of a communication chasm.


You use the term “cat.” I use “feline.” If we each search a recipe database that uses both terms with equal frequency, each of us will get back only half the appropriate recipes: a recall ratio of 50%.


Solution: Add a controlled vocabulary to the search system that gives “feline” and “cat” as equivalent terms.

Search queries will be expanded appropriately.
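The effect on recall can be worked through with a toy collection. The four documents below are made up to match the "equal frequency" scenario, and `search`, `recall`, and `EQUIVALENTS` are hypothetical names used only for this sketch.

```python
# Hypothetical collection: each document is labeled with the one term it uses.
DOCS = {1: "cat", 2: "feline", 3: "cat", 4: "feline"}
RELEVANT = set(DOCS)  # every document is about the same concept

# Controlled vocabulary: groups of terms treated as equivalent.
EQUIVALENTS = [{"cat", "feline"}]

def search(term, expand=False):
    """Return ids of documents matching the term (or its equivalents)."""
    terms = {term}
    if expand:
        for group in EQUIVALENTS:
            if term in group:
                terms |= group
    return {doc_id for doc_id, word in DOCS.items() if word in terms}

def recall(retrieved):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & RELEVANT) / len(RELEVANT)

print(recall(search("cat")))               # 0.5
print(recall(search("cat", expand=True)))  # 1.0
```

Without expansion, the query finds two of the four relevant documents. With the equivalence in place, it finds all four.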


English is rich in words that have more than one disparate meaning

Pitch

To throw a baseball

A tar-like substance

A salesman’s monologue


Bank

Where you store money

The side of a river

To carom a cue ball off a pool table rail

To prepare a fire for the night

To maneuver a plane for a turn


Result: Lots of false drops (irrelevant information), resulting in poor precision.


Solution: use a CV that includes scope notes (definitions) or that uses facets.

Example: Think about searching for the term “Rembrandt.” You might get the following results.


Search: Rembrandt [Go]

The painter Rembrandt was one of the greatest of all the Dutch realists….

If you want to whiten and brighten your teeth, there is no better brand than Rembrandt.


You are probably interested in only one of these “Rembrandts,” so half of your search results are irrelevant.

Now consider what happens if you are able to specify the type of object you are looking for: either an artist or a toothpaste brand.


Artist: Rembrandt

The painter Rembrandt was one of the greatest of all the Dutch realists….

Brand Name: Rembrandt

If you want to whiten and brighten your teeth, there is no better brand than Rembrandt.


You get only results relevant to what you are interested in.

Here, having search boxes identified by attribute (faceted searching) lets you home in quickly on the particular information you want.
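The Rembrandt example can be expressed as a small sketch comparing a free-text search with a search restricted to one attribute. The item structure and the function names (`free_text_search`, `faceted_search`, `precision`) are assumptions for illustration; the two texts come from the slides.

```python
# Hypothetical tagged content: each item carries an attribute (facet) and a term.
ITEMS = [
    {"text": "The painter Rembrandt was one of the greatest of all the Dutch realists.",
     "facet": "Artist", "term": "Rembrandt"},
    {"text": "If you want to whiten and brighten your teeth, "
             "there is no better brand than Rembrandt.",
     "facet": "Brand Name", "term": "Rembrandt"},
]

def free_text_search(word):
    """Match any item whose text mentions the word."""
    return [item for item in ITEMS if word.lower() in item["text"].lower()]

def faceted_search(facet, term):
    """Match only items tagged with the term under the given attribute."""
    return [item for item in ITEMS
            if item["facet"] == facet and item["term"] == term]

def precision(results, wanted_facet):
    """Fraction of the results that are the kind of thing the user wanted."""
    if not results:
        return 0.0
    relevant = sum(1 for item in results if item["facet"] == wanted_facet)
    return relevant / len(results)

print(precision(free_text_search("Rembrandt"), "Artist"))          # 0.5
print(precision(faceted_search("Artist", "Rembrandt"), "Artist"))  # 1.0
```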


You could also use one search and let users filter or narrow results after their search.


Roles for Taxonomies

Tagging documents for a content management system

Provides administrative metadata to control authoring and publishing processes

How are Taxonomies Used?


Administrative metadata: example

Document #

Author

Department

Creation date

Publication date

Expiration date


Tagging document contents for a content management system

Provides metadata to support search

Ensures inter-indexer consistency


Controls subject scattering

Increases search results relevance: tags “aboutness” not just mentions of a word


Search engine component

Translates user’s terms into those used to tag items (increases precision and recall)

Offers options for expanding or reducing scope of search using broader or narrower terms


Differentiates between multiple meanings of terms


Taxonomy Use: Search Results

[Screenshot: search results on rei.com]


Operating as a browsing hierarchy

Organizes content using taxonomy terms as category labels

Represents taxonomy hierarchy by browsing levels


[Screenshot: rei.com browsing hierarchy, with Levels 1 through 4 labeled]

Synonym Ring

Identifies words with equivalent meanings (in a given context)

rock = stone

CD-ROM = CD = disk

money = dough = bucks = greenbacks = legal tender

Types of Taxonomies


When one of the words in a synonym ring is searched for, the search engine expands the search and returns items containing any of the words in the ring.
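A synonym ring can be modeled as a set of words with no preferred member. The sketch below uses the rings from the previous slide. The sample documents, the names `RING_INDEX`, `expand`, and `search`, and the simple substring matching (which would also match "rocket" for "rock") are simplifying assumptions.

```python
# Each ring is an unordered set of equivalent words (lowercased).
RINGS = [
    {"rock", "stone"},
    {"cd-rom", "cd", "disk"},
    {"money", "dough", "bucks", "greenbacks", "legal tender"},
]

# Index every word to its ring so lookup does not scan all rings.
RING_INDEX = {word: ring for ring in RINGS for word in ring}

def expand(word):
    """Return the whole ring for a word, or just the word if it has no ring."""
    w = word.lower()
    return RING_INDEX.get(w, {w})

def search(word, documents):
    """Return documents containing any word from the searched word's ring."""
    terms = expand(word)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]

# Hypothetical documents, for illustration only.
docs = ["A stone wall", "Rock garden ideas", "Paper crafts"]
print(search("rock", docs))
```

Searching for "rock" returns both the "stone" and the "rock" documents, because either word in the ring is accepted.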


Authority File

Has all the features of a synonym ring, plus the identification of preferred terms (approved terms/descriptors/keywords) for tagging content.
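An authority file adds direction to a synonym ring: every variant points to one preferred term, which is the only form used for tagging. In this sketch, the choice of "money" and "rock" as preferred terms is an assumption (the slides do not say which member of each ring is preferred), and `normalize_tags` is a hypothetical helper.

```python
# Hypothetical authority file: preferred term -> variant terms.
AUTHORITY_FILE = {
    "money": ["dough", "bucks", "greenbacks", "legal tender"],
    "rock": ["stone"],
}

# Reverse lookup: any term (preferred or variant) -> preferred term.
TO_PREFERRED = {}
for preferred, variants in AUTHORITY_FILE.items():
    TO_PREFERRED[preferred] = preferred
    for variant in variants:
        TO_PREFERRED[variant] = preferred

def normalize_tags(tags):
    """Replace variants with preferred terms; drop unknown terms and duplicates."""
    result = []
    for tag in tags:
        preferred = TO_PREFERRED.get(tag.lower())
        if preferred is not None and preferred not in result:
            result.append(preferred)
    return result

print(normalize_tags(["Bucks", "dough", "stone", "gravel"]))
```

Dropping terms that are not in the authority file is one possible design choice. A tagging tool could instead flag them for review by a vocabulary editor.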


Taxonomy

Also called hierarchy or classification.

All features of authority files, plus the broader term (BT) and narrower term (NT) relationships.


All terms must be part of a hierarchical relationship (no orphan terms).

Taxonomies may be presented in hierarchical or alphabetical format.


total compensation
. compensation
. . base salary (salary)
. . deferred payments (deferred compensation)
. . variable pay
. benefits
. . 401(k) plan
. . health benefits
. . . dental plan
. . . disability insurance

Types of Taxonomies: Taxonomy Example
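The dot-indented outline above can be read into a broader-term (BT) table, from which narrower terms (NT) follow. This is a minimal sketch under two assumptions: each leading "." marks one level of depth, and the parenthetical variants such as "(salary)" have been left out of the input for simplicity. The names `parse_outline` and `narrower_terms` are hypothetical.

```python
OUTLINE = """\
total compensation
. compensation
. . base salary
. . deferred payments
. . variable pay
. benefits
. . 401(k) plan
. . health benefits
. . . dental plan
. . . disability insurance
"""

def parse_outline(text):
    """Return {term: broader term} from a dot-indented outline."""
    broader = {}
    stack = []  # stack[d] holds the most recent term seen at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split(" ")
        depth = 0
        while depth < len(parts) and parts[depth] == ".":
            depth += 1
        term = " ".join(parts[depth:])
        del stack[depth:]  # discard terms at this depth or deeper
        broader[term] = stack[-1] if stack else None
        stack.append(term)
    return broader

def narrower_terms(term, broader):
    """Return every term below `term` in the hierarchy, in outline order."""
    found = []
    for candidate in broader:
        parent = broader[candidate]
        while parent is not None:
            if parent == term:
                found.append(candidate)
                break
            parent = broader[parent]
    return found

BT = parse_outline(OUTLINE)
print(BT["dental plan"])
print(narrower_terms("benefits", BT))
```

A search component could use `narrower_terms` to widen a query on "benefits" to everything beneath it, or use the BT table to step up one level when a search is too narrow.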


Thesaurus

Plural form: thesauri

All the features of taxonomies, plus the associative relationship of related terms (RT)


Types of Taxonomies: Thesaurus Example, Alphabetical

Building Permits
  BT Permits

Business Licenses
  BT Licenses

Business Taxes
  BT Taxes

Fees
  RT Taxes

Licenses
  NT Business Licenses
  RT Permits

Operating Permits
  BT Permits

Permits
  NT Building Permits; Operating Permits
  RT Licenses

Taxes
  NT Business Taxes
  RT Fees
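The alphabetical listing translates directly into a lookup table keyed by term. Thesauri conventionally keep relationships reciprocal (a BT on one term is matched by an NT on the other, and RT runs in both directions), so the sketch below also checks for that. The data comes from the example; the structure and the `missing_reciprocals` function are assumptions for illustration.

```python
# Thesaurus entries from the example: term -> {relationship: [targets]}.
THESAURUS = {
    "Building Permits": {"BT": ["Permits"]},
    "Business Licenses": {"BT": ["Licenses"]},
    "Business Taxes": {"BT": ["Taxes"]},
    "Fees": {"RT": ["Taxes"]},
    "Licenses": {"NT": ["Business Licenses"], "RT": ["Permits"]},
    "Operating Permits": {"BT": ["Permits"]},
    "Permits": {"NT": ["Building Permits", "Operating Permits"],
                "RT": ["Licenses"]},
    "Taxes": {"NT": ["Business Taxes"], "RT": ["Fees"]},
}

# Each relationship and the one expected on the target term.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT"}

def missing_reciprocals(thesaurus):
    """List (term, relationship, target) entries that should exist but do not."""
    problems = []
    for term, relations in thesaurus.items():
        for rel, targets in relations.items():
            for target in targets:
                back = thesaurus.get(target, {}).get(RECIPROCAL[rel], [])
                if term not in back:
                    problems.append((target, RECIPROCAL[rel], term))
    return problems

print(missing_reciprocals(THESAURUS))  # [] : the example is fully reciprocal
```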


Types of Taxonomies: Thesaurus Example, Hierarchical

Vocabulary Terms             Related Terms
Licenses, Permits & Taxes
. Fees                       Taxes
. Licenses                   Permits
. . Business Licenses
. Permits                    Licenses
. . Building Permits
. . Operating Permits
. Taxes                      Fees
. . Business Taxes


Synonym Ring

+ preferred terms

= Authority File

+ broader/narrower terms

= Taxonomy

+ related terms

= Thesaurus

Types of Taxonomies—Summary


Facets are fundamental categories by which an object or concept may be described

Example: some facets describing a toy ball:

size, weight, shape, color, texture, material

Taxonomies and Facets


Uses of Facets: Browsing Hierarchies

Facets allow users to follow the path best matching the way they think (their mental model).


Example: epicurious.com > recipes > browse

Main ingredient

Cuisine

Preparation method

Season/occasion

Course/dish

[Screenshot: epicurious.com]

Uses of Facets: Fielded Search

Allows for greater specificity, thus increasing search precision.

But this is usually more complicated for users than simple searching, so it is often introduced as an option on the results page.


[Screenshots: alibris.com Advanced Search; epicurious.com Advanced Search]

Requirements for Browsing/Search Facets

Development of metadata schema

Development of appropriate controlled vocabularies

Proper content tagging


Aitchison, Jean. Thesaurus Construction and Use: A Practical Manual. 4th ed. Chicago: Fitzroy Dearborn Publishers

Resources


International standard for metadata: Dublin Core Metadata Element Set (ISO Standard 15836-2003)

http://www.niso.org/international/SC4/n515.pdf


National Information Standards Organization. ANSI/NISO Z39.19:1993. Guidelines for the Construction, Format and Management of Monolingual Thesauri. Bethesda, MD: NISO Press, 1994.

Rosenfeld, Lou, and Peter Morville. Information Architecture for the World Wide Web: Designing Large-Scale Websites. 3rd ed. O’Reilly Publishers, 2006.


Sinha, Rashmi. Beyond Cardsorting: Free-listing Methods to Explore User Categorizations.

Available at: http://www.boxesandarrows.com/archives/beyond_cardsorting_freelisting_methods_to_explore_user_categorizations.php

Steckel, Mike, Karl Fast and Fred Leise. Creating a Controlled Vocabulary. 2002.

Available at: http://www.boxesandarrows.com/archives/creating_a_controlled_vocabulary.php


Contact Information

Fred Leise

www.contextualanalysis.com

fredleise@contextualanalysis.com

@ChicagoIndexer
