Enhancing Access to Databases – LITA Forum, Norfolk 2003
Shirley Rodgers and James M Jackson Sanborn
Enhancing Access to Databases: MultiSearch and Database Relevancy – the Integration of Two Collaborative Projects
Shirley Rodgers
James Jackson Sanborn
Database Access Problems
• Locating and selecting appropriate database
• Multiple searches through multiple database interfaces resulting in multiple result sets
Old Database Approach
Access to databases was clunky and non-intuitive.
– Alphabetical list
– Subject lists that were long and also alphabetical
Old Subject Page
Multiple Search Problem
[Diagram: each database has its own interface and needs its own search, repeated for every database]
Multiple Search Problem
Patrons demanded a solution
– Old “Locate Databases by Keywords” – 79% of searches failed (>6k)
• geodesic domes
• stem cell and optical nerve
• goat milk spider silk
• factors that explain marital happiness when spouse lives in nursing home
Two Problems, Two Solutions
Database Relevancy Project
Database Relevancy
Goals:
– Intuitive display of databases
– Improved subject access
– Maintainable solution
– Leverage existing data
Database Relevancy
Plan:
– Sort databases by relevancy within subject area
– Provide additional information for databases ‘important’ to a subject area
– Automatically generate lists
Technical Details
Data:
– Drawn from catalog
• MyLibrary Subject Headings (690 $x)
• Descriptive notes (520 $a)
• URL (856 $u)
– Three levels of relevancy assigned
• Core, Narrow, Broad (690 $R)
• Assigned at the subject level
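The relevancy ordering described above can be sketched in a few lines. This is only an illustration, not the project's code: the ranking Core, then Narrow, then Broad is assumed from the order on the slide, and the database titles are placeholders.

```python
# Assumed ranking: Core first, then Narrow, then Broad (the order the slide lists them).
RELEVANCY_RANK = {"Core": 0, "Narrow": 1, "Broad": 2}


def sort_for_subject(databases):
    """Order databases by relevancy level, then alphabetically by title."""
    return sorted(
        databases,
        key=lambda db: (RELEVANCY_RANK[db["relevancy"]], db["title"].lower()),
    )


# Hypothetical entries for one subject page; titles are placeholders.
subject_page = [
    {"title": "Database C", "relevancy": "Broad"},
    {"title": "Database B", "relevancy": "Core"},
    {"title": "Database A", "relevancy": "Narrow"},
    {"title": "Database A2", "relevancy": "Core"},
]

ordered_titles = [db["title"] for db in sort_for_subject(subject_page)]
print(ordered_titles)
```

Because the level is assigned per subject, the same database could rank as Core on one subject page and Broad on another.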
MARC transformed to XML using the Perl module MARC.pm
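A rough picture of the XML this step produces, written in Python rather than Perl. The element names record, field and subfield and the type attribute are taken from the XPath expressions shown later in the slides; the root element name, the sample values, and putting $x and $R in the same 690 field are assumptions.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Build <record>/<field type=..>/<subfield type=..> elements from simple tuples."""
    root = ET.Element("marc")  # root element name is an assumption
    for record in records:
        record_el = ET.SubElement(root, "record")
        for tag, subfields in record:
            field_el = ET.SubElement(record_el, "field", {"type": tag})
            for code, value in subfields:
                subfield_el = ET.SubElement(field_el, "subfield", {"type": code})
                subfield_el.text = value
    return root


# One placeholder record carrying the fields the slides mention.
sample = [
    [
        ("245", [("a", "Example Database")]),
        ("520", [("a", "Placeholder descriptive note.")]),
        ("690", [("x", "Example Subject"), ("R", "Core")]),
        ("856", [("u", "http://example.org/db")]),
    ]
]

root = records_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```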
XML transformed multiple times using XSLT - processed through Saxon, called by brief Perl scripts.
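A sketch of chaining several XSLT passes through Saxon from a short script, in Python instead of Perl. The file names and jar name are hypothetical, and the command-line form (-s:, -xsl:, -o:) is the one used by current Saxon releases, which may differ from the version in use in 2003.

```python
import subprocess


def saxon_commands(source, stylesheets, saxon_jar="saxon.jar"):
    """Build one Saxon command per stylesheet; each step reads the previous output."""
    commands = []
    current = source
    for step, stylesheet in enumerate(stylesheets, start=1):
        output = f"step{step}.xml"
        commands.append(
            ["java", "-jar", saxon_jar, f"-s:{current}", f"-xsl:{stylesheet}", f"-o:{output}"]
        )
        current = output
    return commands


def run_pipeline(commands):
    """Run the steps in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical file names; only the command lines are built and printed here.
cmds = saxon_commands("databases.xml", ["sort.xsl", "subject.xsl"])
for cmd in cmds:
    print(" ".join(cmd))
```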
Why XML
• Much easier to manipulate using XSLT than using Perl to directly manipulate MARC
• Simpler to use than importing MARC into a 2nd database and using ColdFusion
• Easy to test on desktop then move to production
Limitations of XML/XSLT
• Multiple versions of MARC.XML
• XSLT has limited string processing functionality
• Need Perl to handle multiple file generation based on hash value pairs
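The last limitation, one output file per key in a hash, is the part a scripting language handles easily. A minimal Python sketch of the idea follows; the subject names, file naming scheme and file contents are all hypothetical.

```python
import re
import tempfile
from pathlib import Path


def slug(subject):
    """Turn a subject heading into a safe file name stem."""
    return re.sub(r"[^a-z0-9]+", "-", subject.lower()).strip("-")


def write_subject_files(pages, out_dir):
    """Write one XML file per subject key and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for subject, body in pages.items():
        path = out / f"{slug(subject)}.xml"
        path.write_text(body, encoding="utf-8")
        written.append(path)
    return written


# Placeholder subjects and bodies; a real run would hold transformed XML.
pages = {"Animal Science": "<subject/>", "Textiles": "<subject/>"}

with tempfile.TemporaryDirectory() as tmp:
    paths = write_subject_files(pages, tmp)
    names = sorted(p.name for p in paths)

print(names)
```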
Detail of Record
Collaboration
• Stakeholders brought in early
• Subject specialists from Collection Management and Reference
– Gave input on “look and feel” issues and functionality
– Given final say on database relevancy
• Technical development in DLI and Systems departments
Two Problems, Two Solutions
MultiSearch Project
The beginning of MultiSearch
• BlueAngel MetaStar for indexing in-house collections and GIS
• Wanted to learn Java and JSP
• Testing it with other Z39.50 servers
The beginning of MultiSearch
• Prototype of cross-searching 2 major database vendors
• How many vendors support Z39.50?
How can I use what I prototyped?
• Static list of databases – subject and alphabetical
• Database relevancy pages created using XML/XSLT
• JSP can access XML files
• The projects came together!
How is XML used?
• JSP Xtags can access XML files
• Subject pages use XML and XSLT to display information
<xtags:style xml='<%=xmlfile%>' xsl='<%=xslfile%>'/>
Databases listed using XML/XSLT
How is XML used?
• List of databases to search created from parsing XML file
<xtags:forEach select="//record">
  <xtags:variable id="url856" type="string" select="field[@type='856']/subfield[@type='u']"/>
  <xtags:variable id="dbtitle" type="string" select="./field[@type='245']/subfield[@type='a']"/>
  <%-- url856 and dbtitle are now available for building the list of search targets --%>
</xtags:forEach>
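For readers without a JSP container, the same extraction can be tried on the desktop. The sketch below mirrors the two XPath expressions from the slide using Python's standard library; the sample XML, its root element name and the URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the record/field/subfield shape the XPath expressions imply.
SAMPLE = """<marc>
  <record>
    <field type="245"><subfield type="a">Example Database One</subfield></field>
    <field type="856"><subfield type="u">http://example.org/one</subfield></field>
  </record>
  <record>
    <field type="245"><subfield type="a">Example Database Two</subfield></field>
    <field type="856"><subfield type="u">http://example.org/two</subfield></field>
  </record>
</marc>"""


def search_targets(xml_text):
    """Collect title (245 $a) and URL (856 $u) for every record in the file."""
    root = ET.fromstring(xml_text)
    targets = []
    for record in root.iter("record"):
        url856 = record.findtext("field[@type='856']/subfield[@type='u']", default="")
        dbtitle = record.findtext("field[@type='245']/subfield[@type='a']", default="")
        targets.append({"dbtitle": dbtitle, "url856": url856})
    return targets


targets = search_targets(SAMPLE)
for target in targets:
    print(target["dbtitle"], target["url856"])
```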
Search targets obtained from XML file using Xtags
Querying the Z39.50 targets is easy!
Working with the data you get back is another story!
Vendor differences
• Authentication
– Username/passwords
– IP authentication
• Z39.50 attributes
– Word & WordList
– Any & Anywhere
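One way to cope with the attribute differences is a small per-vendor profile that is consulted when the query is built. The sketch below is an assumption about how that could look, not the MultiSearch code: the vendor names are hypothetical, the numbers are the Bib-1 values for these attributes (Use 1016 = Any, 1035 = Anywhere; Structure 2 = Word, 6 = Word list), and the query is written in YAZ-style prefix notation.

```python
# Bib-1 attribute values: Use 1016 = Any, 1035 = Anywhere; Structure 2 = Word, 6 = Word list.
USE = {"Any": 1016, "Anywhere": 1035}
STRUCTURE = {"Word": 2, "WordList": 6}

# Hypothetical vendor profiles; real targets would need to be checked one by one.
VENDOR_PROFILES = {
    "vendor_a": {"use": "Any", "structure": "Word"},
    "vendor_b": {"use": "Anywhere", "structure": "WordList"},
}


def pqf_query(vendor, terms):
    """Build a prefix-notation query using the attributes this vendor expects."""
    profile = VENDOR_PROFILES[vendor]
    use = USE[profile["use"]]
    structure = STRUCTURE[profile["structure"]]
    escaped = terms.replace("\\", "\\\\").replace('"', '\\"')
    return f'@attr 1={use} @attr 4={structure} "{escaped}"'


query_a = pqf_query("vendor_a", "goat milk spider silk")
query_b = pqf_query("vendor_b", "goat milk spider silk")
print(query_a)
print(query_b)
```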
Vendor differences
Data Formats
– MARC
– SUTRS (Simple Unstructured Text Record Syntax)
• Requires special processing to parse the “blob” and display the data
• Can’t merge, de-dup or sort these records
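What "special processing" for a SUTRS blob might involve, as a sketch only. SUTRS has no fixed layout, so the "Label: value" lines with indented continuation lines assumed here are hypothetical; each vendor's blob would need its own rules.

```python
def parse_sutrs(blob):
    """Split a text blob of 'Label: value' lines into a dict.

    Indented lines are treated as continuations of the previous label.
    """
    record = {}
    label = None
    for line in blob.splitlines():
        if not line.strip():
            continue
        head, sep, rest = line.partition(":")
        if sep and not line[0].isspace() and head.strip():
            label = head.strip()
            record[label] = rest.strip()
        elif label is not None:
            record[label] = (record[label] + " " + line.strip()).strip()
    return record


# Placeholder blob in the assumed layout.
BLOB = """Title: Example article title
Author: Doe, J.
Source: Example-Journal. 2002, 72: 12,
    1122-1124
"""

parsed = parse_sutrs(BLOB)
for key, value in parsed.items():
    print(f"{key} = {value}")
```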
Vendor differences
Source information (773 field)
• Contains the journal title, ISSN, year, volume, issue, and pages
• Used for E-Journal Finder and SFX
• Vendors use different subfields for this information
Vendor differences
Source information (773 field)
773$t Pet Product News
773$x 0899-2177
773$g May 1997, v51, n5, p64(2)
Vendor differences
Source information (773 field)
773$x 0003-0031
773$t American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.
[Callouts on the $t value: title, period, year, volume, issue, pages]
Vendor differences
Source information (773 field)
Aquatic Toxicology [Aquat. Toxicol.]. Vol. 59, no. 3-4, pp. 163-175. 24 Sep 2002.
Review of Palaeobotany and Palynology, 119 (1-2) pp. 93-112, 2002
Indian-Journal-of-Animal-Sciences. 2002, 72: 12, 1122-1124; 10 ref.
History-and-Theory. My 02; 41(2): 250-263
Vendor differences
Source information (773 field)
Challenge – Get from this:
773$x 0003-0031 773$t American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.
To this:
[Screenshot not preserved in the transcript]
How is this accomplished?
• Study patterns in the 773 field for the database
• Write SFX source parsers for each format to parse the 773 field into separate fields for ISSN, ISBN, volume, issue, start page and end page
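The pattern-per-format idea can be illustrated with two regular expressions, one for each of the 773 layouts shown on the earlier slides. This is a Python sketch, not an actual SFX source parser; the parser names are hypothetical and the patterns cover only these two layouts.

```python
import re

# Layout seen in $t: "Title. 2003, 149: 1, 104-120; 39 ref."
TITLE_YEAR_VOL = re.compile(
    r"^(?P<title>.+?)\.\s+(?P<year>\d{4}),\s+(?P<volume>\d+):\s*(?P<issue>[\d-]+),"
    r"\s+(?P<spage>\d+)-(?P<epage>\d+)"
)
# Layout seen in $g: "May 1997, v51, n5, p64(2)"
MONTH_V_N_P = re.compile(
    r"^(?P<month>[A-Za-z]+)\s+(?P<year>\d{4}),\s+v(?P<volume>\d+),\s+n(?P<issue>\d+),"
    r"\s+p(?P<spage>\d+)"
)


def parse_with(pattern, text):
    """Return the named groups as a dict, or None when the layout does not match."""
    match = pattern.search(text)
    return match.groupdict() if match else None


# Hypothetical parser names, one per source layout.
PARSERS = {
    "title_year_vol": lambda text: parse_with(TITLE_YEAR_VOL, text),
    "month_v_n_p": lambda text: parse_with(MONTH_V_N_P, text),
}

first = PARSERS["title_year_vol"]("American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.")
second = PARSERS["month_v_n_p"]("May 1997, v51, n5, p64(2)")
print(first)
print(second)
```

A layout that does not match returns None, which is a cue to study that database's 773 pattern and add another parser.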
How is this accomplished?
• Store the parser name for each database in a database
• Look up the parser name and pass it to SFX in the OpenURL
sfx.lib.ncsu.edu:9003/ncsu?sid=MULTISEARCH:zsilver2&issn=1068-5472&isbn=&atitle=Phalaenopsis+orchid+plant+named+%27Anthura+Gold%27.&pid=US-pat-Plant.+%5BWashington%2C+D.C.+%3A+U.S.+Patent+and+Trademark+Office%2C+1976-.+May+21%2C+2002.+%2812%2C639%29+3+p.
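The URL above suggests the parser name travels in the sid parameter after "MULTISEARCH:". A minimal sketch of the lookup and URL construction under that reading follows; the database key and lookup table are hypothetical, the http scheme is assumed, and urlencode percent-encodes the colon in sid, which is equivalent to the literal colon in the slide's URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path as shown on the slide; the scheme is an assumption.
SFX_BASE = "http://sfx.lib.ncsu.edu:9003/ncsu"

# Hypothetical lookup table standing in for the database of parser names.
PARSER_FOR_DATABASE = {"example_db": "zsilver2"}


def build_openurl(database, issn="", isbn="", atitle="", pid=""):
    """Look up the parser for this database and place it in the sid parameter."""
    parser = PARSER_FOR_DATABASE[database]
    params = [
        ("sid", f"MULTISEARCH:{parser}"),
        ("issn", issn),
        ("isbn", isbn),
        ("atitle", atitle),
        ("pid", pid),
    ]
    return f"{SFX_BASE}?{urlencode(params)}"


url = build_openurl(
    "example_db",
    issn="1068-5472",
    atitle="Phalaenopsis orchid plant named 'Anthura Gold'.",
    pid="Placeholder source string for the parser",
)
decoded = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(url)
```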
Vendor differences
Full Text
• 856$u - link or pdf
• 900$a Magazine: Horticulture, December 2002
900$a SLIP INTO THE HOLIDAYS
900$a Whether you're in a mood to celebrate or not,
Rollout of Service
• Group created to design page layouts and functionality
• Decided to display all databases on results page, not just ones with Z39.50 search capabilities
• Provide link to search the non-Z39.50 databases directly
Rollout of Service
• Load tests to measure performance with more users
• Production – August 19th, 2002
MultiSearch & Database Finder Usage Statistics
April 2003 - Hits
• Homepage 272,583
– Database Finder 44,813
• Subject Pages 38,676
– MultiSearch 13,372
Post Rollout
• Continued to work to add other vendors with Z39.50 access
• Changed the look of the subject page to make MultiSearch more noticeable
MultiSearch Version 2.0
• Converted from E-Journal Finder to SFX
• Advanced Search – allow users to select databases to search
• Merging, sorting, and de-duping results
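Merging, sorting and de-duping can be sketched once records have been parsed into common fields. The matching rule (ISSN plus normalized article title) and the sort order (newest first, then title) are assumptions for illustration, and the records are placeholders.

```python
def dedup_key(record):
    """Match records on ISSN plus the article title with case and punctuation removed."""
    title = "".join(ch for ch in record["atitle"].lower() if ch.isalnum())
    return (record.get("issn", ""), title)


def merge_results(result_sets):
    """Merge per-vendor result lists, keep one copy of each record, newest first."""
    merged = {}
    for source, records in result_sets.items():
        for record in records:
            key = dedup_key(record)
            if key in merged:
                merged[key]["sources"].append(source)
            else:
                merged[key] = {**record, "sources": [source]}
    return sorted(merged.values(), key=lambda r: (-int(r["year"]), r["atitle"].lower()))


# Placeholder result sets from two hypothetical vendors.
result_sets = {
    "vendor_a": [{"atitle": "Example Article One", "issn": "0000-0001", "year": "2002"}],
    "vendor_b": [
        {"atitle": "Example article one.", "issn": "0000-0001", "year": "2002"},
        {"atitle": "Example Article Two", "issn": "0000-0002", "year": "2003"},
    ],
}

results = merge_results(result_sets)
for r in results:
    print(r["year"], r["atitle"], r["sources"])
```

This only works for structured records, which is why the SUTRS blobs mentioned earlier cannot take part.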
MultiSearch Version 3
Non-Z39.50 databases
• Screen scraping vendor sites to search and get the number of results
• Link to vendor site for results
• Only do this for core databases
• Time consuming to maintain
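Why screen scraping is time consuming to maintain shows up even in a tiny sketch: every vendor needs its own pattern for the hit count, and each pattern breaks when the vendor changes its page. The vendor names, patterns and HTML snippets below are hypothetical.

```python
import re

# One hypothetical pattern per vendor results page.
HIT_PATTERNS = {
    "vendor_x": re.compile(r"Results\s+\d+\s*-\s*\d+\s+of\s+([\d,]+)"),
    "vendor_y": re.compile(r"([\d,]+)\s+records? found", re.IGNORECASE),
}


def hit_count(vendor, html):
    """Return the number of results found on the page, or None if the pattern fails."""
    match = HIT_PATTERNS[vendor].search(html)
    return int(match.group(1).replace(",", "")) if match else None


count_x = hit_count("vendor_x", "<p>Results 1 - 10 of 1,234</p>")
count_y = hit_count("vendor_y", "<b>57 records found</b>")
count_missing = hit_count("vendor_y", "<b>No matches</b>")
print(count_x, count_y, count_missing)
```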
MultiSearch Version 3
Download capabilities
• Mark citations for download to a file, email or bibliographic software
Looking forward
Good service and tool for patrons for today
Technology is changing. New protocols coming. It will only get better and hopefully easier
Two Problems, Two Solutions
One Service
Demo of Database Relevancy & MultiSearch