Post on 18-Jan-2016

Searching for NZ Information in the Virtual Library

Alastair G Smith

School of Information Management

Victoria University of Wellington

Overview

Search engines: local vs global

Search engines: limitations

Searching for NZ info: effective strategies

Information quality on the Web

Making NZ info more accessible: the role of librarians

NZ information online

Online access can mean that US and European information is easier to access than NZ information, e.g. Dialog

However, the Internet provides an accessible infrastructure for making NZ information available, e.g. Knowledge Basket

Search tool definitions

Directories: resources categorised by human beings, e.g. Yahoo!, Te Puna Web Directory

Search engines: automatically created databases of web pages, searchable by keyword, e.g. Google, SearchNZ

Role of Search Engines

Convenient, fast, and usually find some information (if not the most relevant)

Most people turn to a search engine first (GVU user survey: 85%)

For NZ information we have a choice:

Global search engines, e.g. Google

Local search engines, e.g. SearchNZ

Comparing NZ and global search engines

Experiment compared NZ, global and metasearch engines

Test questions on NZ topics

Compared relative recall

Global Search Engines

AlltheWeb/FAST http://www.alltheweb.com/

Google http://www.google.co.nz/

HotBot http://hotbot.lycos.com/

AltaVista http://nz.altavista.com/

Local Search Engines

SearchNZ http://www.searchnz.co.nz/

SearchNow http://www.searchnow.co.nz (no longer exists)

NZExplorer http://nzexplorer.co.nz/

Metasearch engines

Excite http://www.excite.com/

Vivisimo http://vivisimo.com/

Surfwax http://www.surfwax.com/

Examples of test questions

A description and image of the Maori flag

Information about the Otago Central Rail Trail

Information on the payment of British pensions in NZ

Recall

Recall: proportion of possible relevant documents found in search, e.g.

100 relevant documents in database

Search finds 20 relevant documents

Recall is 20%

Problems in using recall to evaluate search engines:

Don’t know total number of relevant documents on Web

Ranking: Is document “found” if it appears in first 10, first 20…?
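
The recall arithmetic, and the ranking question of what counts as "found", can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `recall` and `recall_at_k` are hypothetical and do not come from the talk.

```python
def recall(found_relevant: int, total_relevant: int) -> float:
    """Proportion of all relevant documents that the search retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return found_relevant / total_relevant


def recall_at_k(ranked_hits, relevant, k):
    """Recall when only the first k ranked hits count as 'found'."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant must not be empty")
    top_k = set(ranked_hits[:k])
    return len(top_k & relevant) / len(relevant)


# The slide's example: 20 of 100 relevant documents found.
print(f"{recall(20, 100):.0%}")  # 20%
```

Changing `k` changes the score for the same result list, which is why the cut-off (first 10, first 20) has to be fixed before search engines are compared.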

Relative Recall

[Figure: result sets of search engines A, B and C]

Pool results of search engines A, B, C: approximates to all relevant documents

Recall in NZ search engine experiment

“First 20 relative recall”

Noted URLs of relevant documents found in first 20 hits for each search engine

Pooled results for all search engines

Used pooled list as approximation of all relevant documents
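
The pooling procedure can be sketched as follows. This is a minimal sketch, assuming each engine's relevant URLs from its first 20 hits have already been judged and collected. The engine labels and URLs are placeholders, not data from the experiment.

```python
def relative_recall(relevant_by_engine):
    """Map each engine to its share of the pooled relevant URLs.

    relevant_by_engine: dict of engine name -> iterable of relevant URLs
    found in that engine's first 20 hits.
    """
    # The pool (union of all engines' relevant URLs) stands in for the
    # unknown set of all relevant documents on the Web.
    pool = set().union(*relevant_by_engine.values())
    if not pool:
        return {engine: 0.0 for engine in relevant_by_engine}
    return {
        engine: len(set(urls)) / len(pool)
        for engine, urls in relevant_by_engine.items()
    }


# Placeholder data for three hypothetical engines A, B and C.
hits = {
    "A": {"http://example.org/1", "http://example.org/2"},
    "B": {"http://example.org/2", "http://example.org/3"},
    "C": {"http://example.org/4"},
}
scores = relative_recall(hits)
```

Because the pool only contains documents that at least one engine found, relative recall is an upper bound on true recall.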

Recall results

[Figure: bar chart of relative recall (%), axis 0 to 45, for Google, AltaVista, AlltheWeb, HotBot, SearchNZ, SearchNow, NZExplorer, Surfwax, Vivisimo and Excite]

Points arising from recall results

Only one local search engine equalled global search engines

No search engine found over half of relevant documents

Metasearch engines did not outperform standalone search engines

Comparison with 2000

[Figure: bar chart of relative recall for NZ questions, axis 0.00 to 0.45, for NZ Explorer, SearchNZ, WebSearchNZ, ANZWERS, Excite Aus, GoEureka, AllTheWeb and Google]

Factors affecting performance of NZ search engines

Global search engines have similar or larger coverage of .nz sites

NZ search engines have less sophisticated search features

36% of sites relevant to NZ topics were outside .nz domain

Global search engines update more rapidly

Overlap of search engine hits

[Figure: chart of number of hits (axis 0 to 50) against number of search engines (1 to 10)]

Implications of overlap results

Most sites only found by one search engine

Few sites found by 7 or more search engines

Little overlap

Comprehensive searches require several search engines
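
An overlap count of this kind can be computed by tallying, for each relevant URL, how many search engines found it. The sketch below assumes the same per-engine sets of relevant URLs used for relative recall. The function name and the sample data are hypothetical.

```python
from collections import Counter


def overlap_histogram(relevant_by_engine):
    """Return {number of engines: number of URLs found by exactly that many}."""
    engines_per_url = Counter()
    for urls in relevant_by_engine.values():
        # set() guards against one engine listing the same URL twice.
        engines_per_url.update(set(urls))
    return Counter(engines_per_url.values())


# Placeholder data: only one URL is found by more than one engine.
sample = {
    "A": ["u1", "u2"],
    "B": ["u2", "u3"],
    "C": ["u4"],
}
histogram = overlap_histogram(sample)
```

A histogram weighted heavily towards 1 is the "little overlap" pattern: dropping any one engine loses documents that no other engine returns.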

Why aren’t metasearch engines better?

Metasearch engines select a few top ranked items from each search engine list

Search engine ranking imperfect

Looking at more results from one search engine may be as useful as looking at a few from each

Metasearch engines use “lowest common denominator” search

But can be useful for specific terminology
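
The "few top-ranked items from each list" behaviour can be illustrated with a toy merger. This is an assumed, simplified strategy for illustration only. It is not the actual algorithm of Excite, Vivisimo or Surfwax, and `per_engine` is a hypothetical parameter.

```python
def metasearch(ranked_by_engine, per_engine=3):
    """Interleave the top `per_engine` hits of each engine, dropping duplicates."""
    merged, seen = [], set()
    for rank in range(per_engine):
        for hits in ranked_by_engine.values():
            if rank < len(hits) and hits[rank] not in seen:
                seen.add(hits[rank])
                merged.append(hits[rank])
    return merged


# Placeholder ranked lists: "a4" sits below the cut-off and is never shown.
ranked = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["a2", "b1"],
}
merged = metasearch(ranked, per_engine=2)
```

Anything an engine ranks below the cut-off never reaches the merged list, so an imperfect ranking in the underlying engines directly limits what the metasearch engine can show.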

Limitations of Search Engines for finding NZ information

“Hidden web”

How does a search engine work?…

Search engine architecture

[Figure: search engine architecture diagram with components Users, Interface, Query Engine, Index, Indexer, Crawler and Web]

Search engine limitations:

Spider can’t access some types of pages: databases, frames, JavaScript…

Only 40% of pages are highly linked; others are difficult for the spider to locate

Search is of a database, not the live Web: “some of the pages that once existed on the Web”

Spider may be optimised for popular sites rather than full coverage
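
A toy crawler, indexer and query engine make two of these limitations concrete: the query runs against the index rather than the live Web, and a page that nothing links to is never reached. Everything below is a hypothetical in-memory sketch. The page names, text and functions are invented for illustration and do not describe any real search engine.

```python
from collections import defaultdict, deque

# Hypothetical toy web: page name -> (page text, outgoing links).
TOY_WEB = {
    "home": ("otago rail trail guide", ["maps"]),
    "maps": ("rail trail maps", []),
    "orphan": ("rail trail history", []),  # no other page links here
}


def crawl(web, seed):
    """Breadth-first crawl: reaches only pages linked from the seed."""
    seen, queue = {seed}, deque([seed])
    while queue:
        page = queue.popleft()
        for link in web[page][1]:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


def build_index(web, pages):
    """Inverted index: word -> set of crawled pages containing it."""
    index = defaultdict(set)
    for page in pages:
        for word in web[page][0].split():
            index[word].add(page)
    return index


def query(index, terms):
    """Return pages in the index that contain every query term."""
    postings = [index.get(term, set()) for term in terms.lower().split()]
    return set.intersection(*postings) if postings else set()


index = build_index(TOY_WEB, crawl(TOY_WEB, "home"))
found = query(index, "rail trail")
```

Here `found` contains "home" and "maps" but not "orphan", even though the orphan page matches the query: it exists on the toy web but never entered the index.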

Implications for Internet search strategy for NZ topics

Use several search engines

Avoid restricting search to .nz domain

Don’t rely on search engines to find everything

Use directories, subject resource guides

Use as many words as possible to describe your topic: optimise relevance

NZ directory examples

NZ Subject Resource Guides

Searching in practice…

Quality of NZ information on the Web

Like global information, and information in print: variable

NZ Information quality examples

Role of librarians in making NZ internet information available

Sharing our knowledge of web navigation…

…Creating search tools and information resources

…Preserving Internet information

Conclusion

NZ search engines do not offer advantages over global search engines

Comprehensive searches involve several search engines, directories, subject guides

Librarians have a role in creating local search tools, and in improving search skills
