The Richer the Record, the Better the Results: Options for enriching bibliographic records in your OPAC
Felicity Dykas, Head of Cataloging,
University of Missouri
Ember Stevens, MLS
2011 MOBIUS Annual Conference, June 8, 2011
OPAC Record Enrichment
• Literature review: what’s out there?
• Enrichment options: commercial and in-house possibilities
• Record enrichment at MU: what we’re doing
Enrich Records with …
• Typically
  • Table of contents
  • Summaries and abstracts
  • Publishers’ descriptions, including information from book jackets
Enrich records with …
Why enrich?
• “If it is too inconvenient I’m not going after it:” convenience as a critical factor in information-seeking behaviors (Connaway, 2011)
• The concept of convenience can include
  • their choice of an information source
  • their satisfaction with the source and its ease of use
  • their time horizon in information seeking
Why enrich?
• Users overwhelmingly search by keyword
• Records need more than author, title, subject headings
• Subject headings contributed an average of 4.84 unique words, while the contents and summary notes fields contributed an average of 15.50 unique words per record. (Markey, 1987)
• Notes fields add words that may be more current and/or more relevant to a particular discipline
Why enrich?
More data allows users to evaluate the resource
• Libraries need to make it easier for end users to quickly ascertain whether items meet their needs. (Calhoun, 2009)
• … giving users significantly enhanced intellectual access to library materials at the point of searching, regardless of where the search takes place. (Banush, 2002)
User expectations
• OPACs have to compete with internet searches, federated searches, full-text ebooks, etc.
• … Google, Amazon, Barnes & Noble
Summon … MERLIN Metadata
Summon … Database Metadata
Research
• After adding TOCs/Summary Notes, circulation rose significantly more than expected. … The percentage increase in circulation after TOCs/Summary Notes were added was 20.40%. (Faiks, 2007)
• The study found that tables of contents do increase usage. … In general, even after adjusting for all the variables (publication date, location, circulation status, subject, and previous use), the odds of a title being used increased by 43% if the titles had online tables of contents. … The largest effect of including a table of contents was for the most recent items. (Morris, 2001)
Research
• … this study suggests that content-enriched metadata overall contribute to higher circulation across the four subject fields. [History, social sciences, language & literature, science & technology] Content-enriched data also play an important role in OPAC discovery. (Tosaka, 2010)
• These data reveal that users’ queries have more matches in TOC than in LCSH. However, users’ queries often failed to find items similar to the target items whether their terms were searched against TOC or LCSH. (Choi, 2007)
Research
• … manual metadata enhancements greatly increase the use of our digital image collections. Enhanced images accounted for quadruple the amount of use as unenhanced images. [Internet searches analyzed using Google Analytics] (Chapman, 2011)
Research: Good Overviews
• Anderies, J. (2004), “Enhancing library catalogs for music”, Conference on Music & Technology in the Liberal Arts Environment, June 2004, Hamilton College.
• Byrum, John D. Jr, and David W. Williamson (2006). Enriching Traditional Cataloging for Improved Access to Information: Library of Congress Tables of Contents Projects. Information Technology and Libraries. March 2006. p. 4-11.
• Tosaka, Yuji, and Cathy Weng. Re-examining Content-enriched Access: Its Effect on Usage and Discovery. (2010)
Caveats
• Adding keywords: increase in recall; decrease in precision (Banush)
• If TOC is long and contains many entries, does this dilute the value of the information once it is put into a 505 field? (Byrum, 2006)
• Time consuming and costly to add
• Copyright issues?
• Enrichment can’t overcome all system limitations
• Displaying content-enriched data in OPAC with matching keywords highlighted is essential in helping users identify the resources they need. (Tosaka, 2010)
Projects to Enhance Records
• Library of Congress
• OCLC
Library of Congress Projects
BEAT – Bibliographic Enrichment Advisory Team
• Scan and add links to digital information
• Some automated processes
• Add publisher supplied information
• Add information from other sources
• Annotations for sites selected annually by the MARS Best Free Reference Web Sites Committee
CIP
• Cover art
• ONIX data including summaries
More information: http://www.loc.gov/catdir/beat/ and Byrum (2006)
LC record in MERLIN (DLC|cDLC|dBTCTA|dYDXCP|dIG#|dBWX)
“Machine-generated contents note:”
“—Provided by publisher.”
Best Free Reference Web Sites
OCLC Projects
• Next Generation Cataloging
• Pilot project related to use of upstream metadata and enrichment using publisher and vendor ONIX metadata
• Where possible, will use automated processing
• www.oclc.org/partnerships/material/nextgencataloging.htm
• Work record project: Will re-use summaries, contents, subject headings for identical works
OCLC Projects
• OCLC has partnered with All Music Guide and Rovi to enhance records for pop and classical music. Information added includes descriptions, genres, styles, release dates, tracks, ratings, etc. (olac.org/drupal/newsletters/enews/2010Dec/oclcnews.html)
• AllMusic metadata is attached to bibliographic records for sound recordings. Adding third-party enhanced content is an OCLC priority. (MOUG newsletter, June 2011, p. 26, 28)
Some allmusic.com Notes in WorldCat
• 500 Song titles from AllMusic.
• 500 "The explicit stated sequel to 1975's masterpiece 'Captain Fantastic & the brown dirt cowboy'"--AllMusic.Com.
• 500 Originally released in 1953 as Jazz funeral in New Orleans--Allmusic.com, accessed 31 Jan. 2011.
• 500 "Jazz-rock, fusion, avant-garde"--www.allmusic.com
• 500 "Alternative metal, Industrial metal, Rap-metal"--allmusic.com
Slide from presentation at the OCLC ARC America Regional Council meeting, ALA Midwinter 2011: WorldCat Quality / Karen Calhoun
http://www.oclc.org/multimedia/2011/ARC_ALA2011.htm
Recent Cataloging (MU)
These projects are making a difference for us!
• Recent new books list (monographs)
  • 355 records total
  • 67 records have summaries (520) – 19%
  • 181 records have table of contents (505) – 51%
• Recent FastCat records (monographs, DLC and PCC)
  • 682 records total
  • 127 records have summaries (520) – 19%
  • 520 records have table of contents (505) – 59%
Enrichment Options
• Commercial
  • Convenient
  • Costly
• In-house
  • Flexible
  • Customizable
  • Time consuming
Commercial Services
• Many backed by Bowker (via Books in Print data)
• Usually between $1-$2 per record; may be as cheap as $.02 per record with Syndetics
Commercial Services: Syndetics
• Distributed by Bowker
• Table of contents, first chapters/excerpts, summaries, author notes, reviews, additional media
• Boston University, Oklahoma State, University of Chicago
• Cost depends on collection size
• Doesn’t become permanent part of the record
Boston University
Table of contents
Summary & Review
University of Chicago
Ability to Tag
Summary
Table of contents
Book covers
Commercial Services: Blackwell/YBP
• “Tables of Contents Catalog Enrichment Service”
• Distributed by Bowker
• Acquired by Baker & Taylor/YBP in 2009
• TOCs, author notes, and book jacket summaries
• Information is permanently added to bibliographic record
• UM cost: $1.18 per title (expect this will increase)
MU’s MERLIN
Problems: Promotional Blurbs
"This book provides an excellent overview on opportunities for economic applications of the Information-Gap Theory."
"A must-read for serious economic decision-makers."
Problems: Formatting
All caps
Commercial Services: LibraryThing
• Distributed by Bowker
• Book recommendations, tagging, other editions, patron reviews, shelf browse
• 20% discount for consortia
• University of Denver, Brigham Young University, Cal State Channel Islands
• Info stored on LibraryThing server, updates in real time
• Annual subscription
University of Denver
Virtual ShelfBrowser
Brigham Young University
Similar items
Similar items
Tags
Brigham Young University
Similar items
Reviews
Commercial Services: MARCIVE
• Table of contents, summaries, Accelerated Reader program, Fiction/Biography information, Lexile Measures, Reading Counts
• Table of Contents: $.50/record
• Fiction/Biography enrichment: $.50/record
• Summaries: $.30/record
• UC Merced, Brown University, U. of Georgia School of Law
University of California Merced
Table of contents
Abstract & Review
Record Enrichment at MU
• Focus is on monographs
• Regular practices – at point of cataloging
  • Abstracts or summaries for MU theses, dissertations, master’s projects
  • Title added entries for music materials
  • Genre/form terms
  • Local collection headings
  • Donor information (Honor with Books, MU Remembers)
  • TOCs for ebooks using macros (post-cataloging projects, too)
• Regular practices – post cataloging
  • OCLC bib notification
  • TOCs for print engineering books (added by public services staff)
  • Recatalog to analyze volumes when requested by selector
• Commercial: YBP TOCs, summaries, author information
Record Enrichment at MU
• Experiments
  • Links to WorldCat Identities for authors
  • Summaries and other information for fiction
  • Summaries from publishers
  • Convert TOCs from LC links to TOC in bib records
  • Descriptions from exhibits and Special Collections pages (pending)
YBP – Purchased Metadata
• Record criteria for April 2011 enrichment
  • No table of contents (MARC tags 505 or 970s)
  • Published between 2008 and 2012
  • Monograph
  • Not an online book
  • Not a government publication
  • No subject headings with “examination questions” or “questions”
• April 2011 enrichment
  • 21,224 records sent
  • 9,701 records enriched (46% of records sent)
  • 7,070 tables of contents added (73% of enriched records)
  • 5,240 summaries added (54% of enriched records)
  • 2,076 author information added (21% of enriched records)
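
The selection criteria above amount to a filter over catalog records. The following is a minimal sketch in Python, not MU's or YBP's actual process. It assumes each record has already been flattened into a dictionary, and every key name (`tags`, `pub_year`, `is_monograph`, `is_online`, `is_gov_pub`, `subjects`) is hypothetical. It also assumes "between 2008 and 2012" is inclusive.

```python
def is_970s_tag(tag):
    """True for MARC tags 970-979, the "970s" named in the criteria."""
    return tag.isdigit() and 970 <= int(tag) <= 979


def meets_enrichment_criteria(record):
    """Apply the April 2011 selection criteria to one flattened record."""
    has_toc = any(t == "505" or is_970s_tag(t) for t in record["tags"])
    # "examination questions" also contains "questions", so one test covers both.
    has_questions_heading = any(
        "questions" in heading.lower() for heading in record["subjects"]
    )
    return (
        not has_toc
        and 2008 <= record["pub_year"] <= 2012  # assumed inclusive
        and record["is_monograph"]
        and not record["is_online"]
        and not record["is_gov_pub"]
        and not has_questions_heading
    )


# Hypothetical records for illustration only.
candidate = {
    "tags": ["245", "260", "650"],
    "pub_year": 2010,
    "is_monograph": True,
    "is_online": False,
    "is_gov_pub": False,
    "subjects": ["Steel -- Metallurgy"],
}
already_has_toc = dict(candidate, tags=["245", "505"])

print(meets_enrichment_criteria(candidate))        # True
print(meets_enrichment_criteria(already_has_toc))  # False
```

Keeping each criterion as its own clause makes it easy to adjust the date range or add an exclusion for a later batch.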
Record Enrichment at MU: OCLC Bib Notification
• We signed up in 2009 to receive reports for records with new tables of contents and encoding level increases (free) – information is sent based on holdings in WorldCat
• There is an option to receive updated records ($)
• http://www.oclc.org/bibnote/default.htm
MU Process
• Use WorldCat batch processing to search for records using OCLC numbers (from reports), to add constant data, and to export records. Records match on OCLC number
• Load table replaces 001 and 019 in existing local record, and inserts 505 and 520 fields
  • We do not overlay records since we have a shared catalog and we do not want to lose local edits
• Each record is reviewed and duplicate 505s and 520s are deleted
  • We used to insert subject headings, too, but reviewing for duplication was too time consuming
• Procedures: mulibraries.missouri.edu/staff/catalogdept/OCLCbibnotification.htm
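
Because the load table inserts 505 and 520 fields rather than overlaying the record, a record can end up with two copies of the same note. The review described above is manual. As an illustration only, the sketch below shows how exact duplicates could be detected in Python; the `(tag, text)` tuple layout and the function names are assumptions, not part of Millennium or the MU procedures, and near-duplicates would still need a human reviewer.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not hide a match."""
    return " ".join(text.lower().split())


def drop_duplicate_notes(fields, note_tags=("505", "520")):
    """Keep the first copy of each 505/520 note; pass every other field through."""
    seen = set()
    kept = []
    for tag, text in fields:
        if tag in note_tags:
            key = (tag, normalize(text))
            if key in seen:
                continue  # same note already present in the local record
            seen.add(key)
        kept.append((tag, text))
    return kept


# Hypothetical record after the load table has inserted a second 505.
fields = [
    ("245", "Sample title"),
    ("505", "Chapter one -- Chapter two"),
    ("505", "Chapter one --  chapter two"),
    ("520", "A short summary."),
]
for tag, text in drop_duplicate_notes(fields):
    print(tag, text)
```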
Sample Report
MU OCLC Bib Notification statistics
• Time: Averages about two minutes per enriched record
• Statistics: February 2011

Pub dates    Processed    New content
-1899               21             19
1900-1949          191            186
1950-1979          620            598
1980-1989          132            129
1990-1999          151             95
2000-2009          185            116
2010-               66             61
Total             1366           1204
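
The table gives raw counts only. As a small worked example, the Python sketch below derives the share of processed records that received new content in each date range, plus a rough staff-time estimate from the two-minutes-per-record figure. The variable and function names are illustrative, and the time estimate is a simple multiplication rather than a reported figure.

```python
# (publication date range, records processed, records with new content)
stats = [
    ("-1899", 21, 19),
    ("1900-1949", 191, 186),
    ("1950-1979", 620, 598),
    ("1980-1989", 132, 129),
    ("1990-1999", 151, 95),
    ("2000-2009", 185, 116),
    ("2010-", 66, 61),
]


def percent_with_new_content(processed, new_content):
    """Share of processed records that gained new content, as a percentage."""
    return 100.0 * new_content / processed


total_processed = sum(row[1] for row in stats)
total_new = sum(row[2] for row in stats)

for label, processed, new_content in stats:
    print(f"{label:<10} {percent_with_new_content(processed, new_content):5.1f}%")
print(f"{'Total':<10} {percent_with_new_content(total_processed, total_new):5.1f}%")

# Rough staff time at the reported two minutes per enriched record.
estimated_hours = total_new * 2 / 60
print(f"Estimated staff time: {estimated_hours:.0f} hours")
```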
OCLC Bib Notification
Sample information
Enhancing Ebooks with TOC
• Question: How do we enhance ebooks with table of contents?
• Next best thing to full-text searching
• Due to budget reasons, do not send out e-resources for enhancement (table of contents, summaries)
• Policy not to duplicate print titles
• Selectors would appreciate this added service
Solution
• Most vendors provide listings of table of contents
How we did it
• Used Microsoft Word macro feature
• Copy table of contents
• Paste into Microsoft Word (set options to keep text only, eliminating HTML coding)
• Show codes (paragraph symbols, etc.)
• Initially: Use a series of find/replace operations to format TOC correctly and save this into macro
• Now: Use Visual Basic coding in macro program
Example after pasted into Word
After macro is run: Ready for 505
Preface, Sponsors and Organizing Committees -- Effects of Surface Active Element on the Biocompatibility of High Nitrogen Stainless Steel -- Quench Brittleness of 12%Cr Martensitic Heat-Resistant Steel -- The Formation and Occurrence of Non-Metallic Inclusions of Si-Doped Steel during Continuous Casting -- Microstructure and Mechanical Properties of Molybdenum Alloy Strengthened by Lanthanum Oxide and Silicon -- AZ80 Mg Alloy Synthesized by Spray Forming and its Extrudability -- Effect of Carbon Migration on Sulfide Stress Corrosion Cracking Behavior of Dissimilar Joints in Wet H2S Environment -- Microstructures and Mechanical Properties of Dissimilar Metal Weld A508/52M/316L Used in Nuclear Power Plants
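
The macro itself was written in Microsoft Word (find/replace, later Visual Basic) and is not reproduced here. The core transformation it performs, turning one pasted title per line into a single string joined by " -- ", can be sketched in Python as follows. This is a stand-in under the assumption that the pasted text has one entry per line; the function name `toc_to_505` is hypothetical, and diacritics, subscripts, and case would still need manual cleanup.

```python
def toc_to_505(pasted_text):
    """Join pasted table-of-contents lines into one 505-style string."""
    # Collapse runs of whitespace left over from the web page layout.
    titles = [" ".join(line.split()) for line in pasted_text.splitlines()]
    # Drop the blank lines that separate entries after pasting.
    titles = [title for title in titles if title]
    return " -- ".join(titles)


pasted = """
Preface, Sponsors and Organizing Committees

Quench Brittleness of 12%Cr   Martensitic Heat-Resistant Steel
"""
print(toc_to_505(pasted))
```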
Before Macro
After Macro: Ready to Insert into Millennium
Public display in MERLIN
Advantages
• Great keywords, particularly for conference proceedings
• Provides user with info about contents
• If adding into OCLC records (new or already existing), enrich the record for everyone
• We’re able to provide quality customer service for users and public services without spending extra (besides human resources)
• Ebook records often better than print (need FRBR-enhanced catalog)
Disadvantages
• If adding only to MERLIN record (such as for titles from record loads)
  • Only enhances it locally (not internationally)
  • We use non-standard 970 (but does display better)
• Dependent on browser version (Firefox)
  • If browser upgrades/changes, then macro may be affected (re: upgrade to Firefox 4.0)
• Still often need to do manual changes after running macro (diacritics, subscripts, etc.)
• Can be time-consuming & requires hard decisions about what’s most important (cost/benefit ratio).
  • Accept case of TOC
  • Sometimes only include titles (not authors)
Record Enrichment at MU: Fiction
1) Focused on one publisher: HarperCollins.
• Limited search in MERLIN Catalog to genre = fiction and the publisher
• Searched HarperCollins web site for each title on the list and added book descriptions to MERLIN record (MARC 520) when found (cut and paste)
  • 520 “Description.”--Publisher’s website.
• Generally the book descriptions were about three paragraphs of text
• Added 47 summaries for books published in 1990s and 2000s
• Focusing on one publisher made this a fairly quick process
Fiction: HarperCollins
Sample information
Record Enrichment at MU: Fiction
2) Added subject headings, summaries, and TOC from WorldCat
• Focus was on titles published in 1980s and 1990s
• Searched by OCLC number
• If new information was available, added constant data and exported the record (load table set to only load TOCs, summaries, and subject headings)
• Mostly added subject headings
• 110 records searched. 63 had new subject headings, 24 had summaries, and eight had TOCs. 45 records had no enhancements. (65 records enriched)
• About two hours of time (about 32 enriched records per hour)
Fiction
Record that has not been enriched
Fiction
Record that has been enriched
Summary and three subject headings
Record Enrichment at MU: Fiction
3) Focus on authors listed on a course syllabus
• Worked from a list of authors on a creative non-fiction writing syllabus
  • E.g., David Foster Wallace and David Sedaris
• Found records needing enrichment and added information from WorldCat Identities and WorldCat bibliographic records
Fiction: Course Syllabus
Sample information
WorldCat Identities
• David Sedaris: http://www.worldcat.org/identities/lccn-n94-15692
• Information is aggregated from WorldCat records
• Includes:
  • Overview of works
  • Genre
  • Subject headings
  • Book covers
  • Time line showing publications about David Sedaris and publications by David Sedaris
  • Summaries
  • Links to LC authority file, VIAF, and Wikipedia entry
Record Enrichment at MU: Fiction
4) Added links to WorldCat Identity pages
• These pages include information on an author’s works, genres, subject headings, and a timeline of publications by and about the person
• Added links to bib record (MARC 856)
• Created a Millennium macro to speed up the process
• 22 added
• Downside: added a link; information is not embedded in the record
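
The Millennium macro is not shown in the slides. As a rough illustration of what such a link field contains, the Python sketch below builds an 856 string from a WorldCat Identities key such as `lccn-n94-15692` (the David Sedaris example). The function names, the "|" subfield delimiter, the indicator values 4 and 2 (HTTP, related resource), and the wording of the $z public note are all assumptions made for this example, not MU's documented practice.

```python
IDENTITIES_BASE = "http://www.worldcat.org/identities/"


def identities_url(identity_key):
    """Build the WorldCat Identities URL for a key such as 'lccn-n94-15692'."""
    return IDENTITIES_BASE + identity_key


def build_856(identity_key, author_name):
    """Return a display-style 856 field with $u (URL) and $z (public note)."""
    note = "WorldCat Identities page for " + author_name  # hypothetical wording
    return "856 42 |u" + identities_url(identity_key) + " |z" + note


print(build_856("lccn-n94-15692", "David Sedaris"))
```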
Record Enrichment at MU: Special Collections
5) Enhanced Special Collections material
• Limited to monographs in Special Collections about the Civil War
• Searched WorldCat and the internet to find information
• Found a resource at UNC with a lot of information: Documenting the South. Great summaries and outlines for books. Added links to this site.
• Information added to 20 records
Fiction: Special Collections
Sample information
Record Enrichment at MU: LC links
6) Transferred Library of Congress digital TOCs into AACR2 format for inclusion in bibliographic records (MARC 505)
• Text was reformatted in Notepad
• Took three to ten minutes per record. Using a macro would speed this up.
Note: “A human cannot compete with an automated process.”
Record Enrichment at MU: Google Books API
Book cover and link to Google Books:
* Preview and search inside book
* Full text
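
The slide does not say which Google Books interface MERLIN calls. As one possible illustration, the Python sketch below uses the public Google Books API v1 `volumes` endpoint: it builds an ISBN query URL and pulls the cover thumbnail, preview link, and viewability out of a response that has already been parsed from JSON. The choice of endpoint, the function names, and the sample response are assumptions; no network request is made.

```python
from urllib.parse import urlencode

VOLUMES_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def volumes_query_url(isbn):
    """Build a volumes search URL that looks a title up by ISBN."""
    return VOLUMES_ENDPOINT + "?" + urlencode({"q": "isbn:" + isbn})


def cover_and_preview(response):
    """Extract cover, preview link, and viewability from a parsed response."""
    items = response.get("items") or []
    if not items:
        return None  # Google Books has no match for this ISBN
    volume = items[0]
    info = volume.get("volumeInfo", {})
    return {
        "cover": info.get("imageLinks", {}).get("thumbnail"),
        "preview_link": info.get("previewLink"),
        "viewability": volume.get("accessInfo", {}).get("viewability"),
    }


# Hypothetical, hand-written response fragment for illustration only.
sample_response = {
    "items": [
        {
            "volumeInfo": {
                "imageLinks": {"thumbnail": "http://example.org/cover.jpg"},
                "previewLink": "http://example.org/preview",
            },
            "accessInfo": {"viewability": "PARTIAL"},
        }
    ]
}

print(volumes_query_url("0000000000"))
print(cover_and_preview(sample_response))
```

The viewability value is what would let a catalog distinguish between a preview and a full-text link.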
MU Indexing Rules
• Tables of contents
  • 970 $t Title and keyword indexes
• Summaries and other notes
  • Keyword index
• Subject headings
  • LCSH and MeSH: Subject and keyword indexes
  • Other subject heading fields: Keyword index
• Genre/Form terms
  • Genre/form and keyword indexes
Where To Start
• Commercial or in-house
• Type of content to add
• Select material
  • By date
  • Rare books
  • Fiction
  • Items being sent to remote storage
  • Non-major publishers, e.g., self-published titles
  • Books with chapters by different authors (edited books)
  • Music material
  • Theses, dissertations, and other local material
  • Foreign language material
Where To Start
• If in-house, add to WorldCat or just to local catalog(s)
• WorldCat Enrichment
• See OCLC Bibliographic Formats and Standards, Chapter 5.3 on Database Enrichment.
• http://www.oclc.org/bibformats/en/quality/default.shtm#database enrichment
• Allow user tags, reviews, and ratings
Where To Get Information
• WorldCat
• Publisher catalogs and web sites
• WorldCat Identities
• Best Free Reference Web Sites
• Book reviews
• Other ideas?
Enriched Records …
• Researchers can and do use the catalog the way an entire library is used—not only as a source of material and information, but also as a gateway to additional information. Through adding more keyword-rich information to the catalog, libraries can serve the extended information needs of the researcher as well as offer structured pathways to their own information resources. (Byrum, 2006)
The end
• Thanks to Mary Aycock for sharing information on the ebook macros she created!