
A systematic overview of Internet information sources

[email protected]

• Vrije Universiteit Brussel

• Informatie- en Bibliotheekwetenschap, Universitaire Instelling Antwerpen

België

Presentation at Internet Librarian International, in London, England,

March 2002

These slides are available through the WWW from http://www.vub.ac.be/BIBLIO/nieuwenhuysen/presentations/

Contents / summary of this presentation

A systematic overview of information sources and services that are accessible through the Internet, such as

• general WWW directories and search engines, and

• more specialised systems to find

»images,

»books,

»journal articles,

»newsgroup messages.

Internet based information sources: problems / difficulties (Part 1)

• Redundancy and overlap: On the one hand, there is too much information on some topics; in other words, the redundancy and overlap are high in many cases.

• Too few information sources: On the other hand, there are too few information sources on some topics.

Internet based information sources: problems / difficulties (Part 2)

• No order is imposed on most sources.

• Quality checks / quality control are not performed.

• Related to this: it is not required to register new information offered.

• Is the information that you find real, honest, authentic?

Internet based information sources: problems / difficulties (Part 3)

• Change is the only constant: Information sources are constantly changing and growing, and sometimes disappearing.

Internet based information sources: problems / difficulties (Part 4)

• Scattering: There is no single simple but powerful system to find relevant information through the Internet. In other words: integration / aggregation is still far from perfect.

Internet based information sources: problems / difficulties (Part 5)

• Slow: In many places and for many applications, the Internet is not yet fast enough.

Internet based information sources: problems / difficulties (Part 6)

• In conclusion: Surfing, using the Internet, the WWW, can be a time sink instead of a productive activity.

Types of online access information systems: “free” versus “fee”

Public access information sources free of charge

Fee-based online information services (NOT free of charge)

Types of online access information systems: “free” for members only

Public access information sources free of charge

Fee-based online information services (NOT free of charge)

Fee-based online information services, made accessible “free of charge” by an institute to its members

Encyclopedias accessible through Internet and WWW

• Dictionaries and encyclopedias are the first choice among many types of information sources,

»when we do not need detailed information on a common topic

»when we want to prepare a more detailed search on an unfamiliar topic, by searching for the right spelling, synonyms, context,…

• Some dictionaries and encyclopedias are available through the WWW free of charge.

Encyclopedias accessible through Internet and WWW: examples

• Encarta Concise Free Encyclopedia 

»http://encarta.msn.com/

• Encyclopædia Britannica: only a small part is available free of charge, plus links to selected WWW sites

»http://www.britannica.com/

• Encyclopædia Britannica Concise

»http://education.yahoo.com/reference/encyclopedia/

Example

Encyclopedias accessible through Internet and WWW: examples

• The Canadian Encyclopedia (in English and in French):

»http://thecanadianencyclopedia.com/

Example

Internet: subject-oriented meta-information offered via WWW

Information about information sources: in the form of

»subject hypertext directories = subject guides

»key word indexes, generated automatically, for searching

Internet global subject directories: introduction

• They are virtual libraries with open shelves, for browsing.

• They are compiled manually, by many people.

• They can be browsed following a tree structure or a more complicated variation.

• The most famous of these systems are among the most popular and most visited sites on the WWW, for instance Yahoo!

Internet global subject directories: structure

The structure corresponds to a classification that is in most cases specific for the particular overview. In other words: the well-known and classical universal classification systems are not used in most Internet directories.

Internet global subject directories: limitations

• They cover only a small number of selected WWW sites, in comparison with the total number of sites that are accessible.

• They are suitable mainly for broad searches that can be difficult to formulate in words, but NOT for more specific searches that require combinations of several concepts.

Internet global subject directories: Yahoo!

• A hypertext global subject directory can be found at http://www.yahoo.com/ and at many other sites, including http://www.yahoo.co.uk/

• Entries are NOT rated.

• Accessible free of charge.

Example

Internet global subject directories: BUBL link

• A hypertext global subject directory to more than 10 000 WWW sites for the higher education community can be found at http://bubl.ac.uk/link/

• Accessible free of charge.

Example

Internet global subject directories: Google directory

• A hypertext global subject directory can be found at http://directory.google.com/

• Accessible free of charge.

• Very similar to the Open Directory Project.

Example

Internet subject directories focusing on a specific subject domain (Part 1)

• Computer science & engineering: http://www.ub.lu.se/eel/

• Social sciences: http://www.sosig.ac.uk/

• Marine science and oceanography: http://oceanportal.org/

Examples

        

Internet subject directories focusing on a specific subject domain (Part 2)

• Medicine and healthcare: http://www.omni.ac.uk/ and http://www.medscape.com/

• General pediatrics: http://GeneralPediatrics.com and http://www.pedinfo.com/

• Engineering: http://www.eevl.ac.uk/

• Civil engineering: http://www.icivilengineer.com/

Examples

        

Internet subject directories focusing on a specific subject domain (Part 3)

• Fishing: http://www.onefish.org/

• Art, architecture and the media: http://www.adam.ac.uk/ or http://adam.ac.uk/

Examples

        

Internet indexes: automated search tools

• Several systems allow you to search for and locate many items (addressable resources) on the Internet in a more systematic, direct way than by browsing/navigating alone.

• These systems do NOT search the contents of computers through the real Internet in real time and completely when a user makes a query. Searching in that way would be much too slow due to limitations in the technology.

Internet indexes: scheme of the mechanism

User searching for Internet based information

Internet client hardware and software

user interface to a search engine

Internet information source

Internet index search engine

Internet crawler and indexing system

database of Internet files, including an index
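The components named in this scheme can be illustrated with a toy sketch: a crawler collects pages in advance, an indexing step builds a keyword index, and the search engine answers queries from that index rather than from the live Internet. Everything below is an assumption for illustration: the `TOY_WEB` dictionary stands in for real Internet sources, and the function names `crawl`, `build_index` and `search` are hypothetical, not taken from any real search system.

```python
import re
from collections import defaultdict

# Hypothetical miniature "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://example.org/": (
        "Directory of library catalogues",
        ["http://example.org/opac", "http://example.org/books"],
    ),
    "http://example.org/opac": (
        "Online public access catalogues of libraries",
        [],
    ),
    "http://example.org/books": (
        "Book databases provided by bookshops",
        ["http://example.org/opac"],
    ),
}


def crawl(start_url, web):
    """Crawler: follow links from start_url and return {url: text}."""
    seen, queue, pages = set(), [start_url], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in web:
            continue
        seen.add(url)
        text, links = web[url]
        pages[url] = text
        queue.extend(links)
    return pages


def build_index(pages):
    """Indexing system: map each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Search engine: return URLs containing all query words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# The index is built once, ahead of time; queries only touch the index.
index = build_index(crawl("http://example.org/", TOY_WEB))
print(sorted(search(index, "catalogues libraries")))
```

Because the query is answered from the prebuilt index, a page that changed after the crawl would still be reported in its old state until the crawler visits it again.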

Internet indexes: Google (Part 1)

• You can search for WWW pages at http://www.google.com/

• The “simple search” option does NOT offer/allow

»full Boolean searches;

»stemming/truncation.

Example

Internet indexes: Google (Part 2)

• For retrieval an algorithm is used that takes into account the links between WWW pages. A retrieved page is ranked higher when

»many sites/pages point to it

»“important” sites/pages point to it

• Searches include full text searching of files on the WWW; not only html pages, but also files in the formats Adobe PDF, Microsoft Word, Microsoft Excel,…
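The two ranking criteria above (many pages point to it; "important" pages point to it) can be sketched with a simplified PageRank-style iteration. This is only an illustration under stated assumptions: the slides do not specify the algorithm, the damping value 0.85 and the tiny `LINKS` graph are invented for the example, and `link_rank` is a hypothetical name.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by incoming links, weighting links from high-scoring pages more.

    links: {page: [pages it links to]}.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small base score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split over the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Assumed example graph: "c" has two incoming links; "a" and "d" have one each,
# but the link to "d" comes from the well-linked page "c".
LINKS = {
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": [],
    "e": ["a"],
}

rank = link_rank(LINKS)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
```

In this graph, "c" outranks "a" because more pages point to it, and "d" outranks "a" although both have a single incoming link, because the page pointing to "d" is itself highly ranked.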

Example

Internet indexes: Google additional features

• Besides a system to search for WWW pages, Google also offers

»a subject directory

»searching for images on the WWW

»searching an archive of Usenet messages + posting to Usenet groups

• Thus Google has become a great integrator / aggregator.

Example

Internet indexes: coverage / size of each index

The indexes grow and their “size ranking” is variable.

Biggest systems in 2002:

• AltaVista

• Fast = All the Web

• Google

• Systems based on the INKTOMI database of WWW pages, such as Hotbot, MSN Web search,…

Internet indexes cover only a part of the Internet: introduction

The “visible” part of the Internet

The “hidden, invisible” part of the Internet and the WWW (that is not searchable using a global index like AltaVista, Google...)

Current awareness services focusing on WWW pages: introduction

• Tracking changes in one or more public access pages on the WWW or finding new pages, is possible

»by using one of the available, suitable, programs loaded on your client workstation

»through “alert” services based on a server on the WWW

—that track updates for the user/subscriber

—and send alerts by email to the user/subscriber

• Few systems are free of charge.
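How such a tracking service might detect updates can be sketched as follows: store a fingerprint of each page at every visit and compare it with the previous one. This is a minimal sketch, assuming that page contents have already been fetched into a dictionary; the names `fingerprint` and `detect_changes` are hypothetical, and a real service would also fetch the pages and send the alerts by email.

```python
import hashlib


def fingerprint(content):
    """Return a short, stable fingerprint of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def detect_changes(previous, current_pages):
    """Compare the current pages with the previous snapshot.

    previous: {url: fingerprint} from the last visit.
    current_pages: {url: content} as fetched now.
    Returns (alerts, new_snapshot), where alerts is a list of (url, status).
    """
    alerts = []
    snapshot = {}
    for url, content in current_pages.items():
        digest = fingerprint(content)
        snapshot[url] = digest
        if url not in previous:
            alerts.append((url, "new"))
        elif previous[url] != digest:
            alerts.append((url, "changed"))
    for url in previous:
        if url not in current_pages:
            alerts.append((url, "disappeared"))
    return alerts, snapshot


# Assumed example: two visits to a small set of tracked pages.
first_visit = {
    "http://example.org/a": "version 1",
    "http://example.org/b": "stable text",
}
second_visit = {
    "http://example.org/a": "version 2",
    "http://example.org/c": "brand new page",
}

_, snapshot = detect_changes({}, first_visit)
alerts, snapshot = detect_changes(snapshot, second_visit)
print(sorted(alerts))
```

Storing only fingerprints keeps the snapshot small, at the cost of not being able to show what exactly changed on a page.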

Current awareness services focusing on WWW pages: Tracerlock

• http://www.tracerlock.com/ can use one of several external Internet indexes, with a simple search query that you provide, to discover relevant changed or new WWW pages for you in the future

Example

Internet information sources

Coverage of Internet directories and Internet indexes

A global Internet index

A global Internet directory

Global Internet search tools: a comparison

Global Internet directories

• Only a limited selection of Internet sources

• Browsing information sources is easy

• Good for broad searches

Global Internet indexes

• About 1/3 of the Internet is covered by an index

• Searching requires some skills and knowledge

• Good for specific, narrow searches

Multi-threaded search systems

• These get information from directories and indexes

• Searching requires some skills and knowledge

• Good when even 1 index does not yield information
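A multi-threaded search system has to combine the result lists it receives from several directories and indexes. The sketch below shows one possible way to do this; the reciprocal-rank scoring is an assumption chosen for simplicity, not a method described in these slides, and the result lists are invented placeholders.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge ranked result lists from several search tools into one list.

    Each URL scores 1/position in every list where it appears, so URLs that
    rank high in several tools end up at the top. Duplicates are collapsed.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for position, url in enumerate(results, start=1):
            scores[url] += 1.0 / position
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Assumed example: one directory and two indexes, one of which finds nothing.
directory_hits = ["http://example.org/u1", "http://example.org/u2"]
index_a_hits = ["http://example.org/u2", "http://example.org/u3", "http://example.org/u1"]
index_b_hits = []

merged = merge_results([directory_hits, index_a_hits, index_b_hits])
print(merged)
```

The empty list from the second index does no harm: the merged list still contains everything the other tools found, which matches the case where a single index yields no information.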

Finding images on the Internet: introduction

• Several public access search systems are available free of charge to search for images / pictures (artwork, photos, or both) on the Internet.

• When searching for images, the search results from such a system offer not only links to the image files on the Internet, but also directly small versions of the images (so-called “thumbnails”).

Examples

Finding images on the Internet: examples of search engines

• http://alltheweb.com !

• http://gallery.yahoo.com/ !

• http://images.google.com/ !!! or through http://www.google.com/

• http://www.altavista.com/ !! (also audio and video; choose IMAGES in the user interface instead of the normal text search)

• http://www.ditto.com/ !

Examples

Finding images on the Internet: screen shot of a Google image search

Public access book databases: introduction


• Even in this age of Internet-based information sources, a lot of information is still distributed in the form of printed books.

• The content of most books is (still) not available on the Internet.

• Most Internet search tools do NOT allow you to find out about the existence of books that may be interesting for you.

• So, specific search tools to find books can be useful.

Public access book databases provided by bookshops


• To find currently available books, the bibliographic databases assembled by big bookshops are interesting.

• Several offer a good coverage and are accessible free of charge.

Book databases accessible free of charge: examples (Part 1)


• Amazon.com (US): http://www.amazon.com/ and http://www.amazon.co.uk/ (note: amazon, NOT amazone)

• Barnes and Noble (US): http://www.bn.com/

• Blackwell’s on the Internet (International, academic books): http://www.blackwell.co.uk/

Examples

Free public access bibliographic book database + price comparisons

• Even comparisons of the catalogues of shops selling books (as well as music, movies and many other goods) are available free of charge.

• See for instance

»http://www.bookfinder.com/

»http://www.dealtime.com/

Online Public Access Catalogues of libraries

• The catalogues of libraries can be useful, mainly to find older books.

• Most are accessible online and free of charge.

Types of online access information systems: “free” versus “fee”

• A lot of the information on the Internet is available free of charge, but another part is only accessible when a fee is paid to the producer and / or the distributor.

• Some organisations pay these fees for some sources and then organise access, so that the members of the organisation can retrieve and exploit the information as if it were free of charge.

• The first commercial computer systems to make information available online appeared around 1975.

• Most of them are now also available through the Internet.

Online information services: total size of their databases

In 1999:

The big host systems and the public access WWW pages offer a comparable quantity of information:

• WWW offered about 8 terabytes (= 8 000 gigabytes) of text data

(according to Lawrence and Lee Giles, Nature, 1999, Vol. 400, pp. 107-109.)

• Dialog offered about 9 terabytes (= 9 000 gigabytes) (in 1998)

»6 billion pages of text

»3 million images

Online access databases about journal articles: overview


• Thousands of fee-based online access databases offer bibliographies or full-texts of journal articles in particular subject domains.

• Only a few large databases offer free access to bibliographies of articles published in journals.

Online access databases about journal articles: Article@INIST

• Article@INIST allows you to search a bibliographic database (NOT full text) covering journal articles, journal issues, books, reports or conferences, and doctoral dissertations, at the Institut de l'Information Scientifique et Technique, France.

• Searching is free of charge.

• Available from http://form.inist.fr/public/eng/conslt.htm

• Payment is required to receive the full text of an article.

Example

Computer-network interest group system

Computer-network interest groups: the basic scheme

? Question ?

! Answer ! E-mail

Computer-network interest groups: various existing systems

• “Conferences” on computer-services like AOL, CompuServe, Dialog, Data-Star, many Bulletin Board Systems

• E-mail lists !

• Usenet News !

• Furthermore, since the 1990s, the WWW has become a gateway to these and a basis for similar systems.

E-mail - based interest groups: synonyms

E-mail (based) conferences

Computer (based) (discussion) lists

Network (based) discussion groups

forums

interest groups

Listservs

Reflectors

Aliases

E-mail - based interest groups: How to find relevant groups?

You can

• (use printed directories of interest groups)

• use subject-oriented indexes and directories to search for Internet-based sources in general

• search directory files concerning interest groups online! Examples:

»http://groups.yahoo.com/

»http://www.forumone.com/

»http://www.liszt.com/

Usenet News: what it is

• Usenet is a worldwide computer-network conferencing system.

• Usenet is the set of people who exchange articles tagged with one or more universally recognized labels, called “newsgroups” (or “groups” for short).

• Usenet Usenet server computers Usenet clients

Usenet newsgroups form a hierarchical structure

• Newsgroups are organized by subject area into a multi-layered hierarchy ( = tree-structure).

• This helps the users to find the “right” newsgroups.

Usenet newsgroups form a hierarchical structure: an example

... alt comp rec sci soc talk ...

... binaries infosystems lang os ...

... gopher kiosks wais www ...

... misc providers users ...
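The tree structure in this example can be reproduced from dotted newsgroup names, where each dot marks one level down in the hierarchy. The sketch below is illustrative only: the group names are assembled from the levels shown above, and the helper names `build_hierarchy` and `children` are hypothetical.

```python
def build_hierarchy(group_names):
    """Turn dotted newsgroup names into a nested dictionary (a tree)."""
    tree = {}
    for name in group_names:
        node = tree
        for part in name.split("."):
            # Create the level if it does not exist yet, then descend into it.
            node = node.setdefault(part, {})
    return tree


def children(tree, prefix=""):
    """List the names directly below a given point in the hierarchy."""
    node = tree
    if prefix:
        for part in prefix.split("."):
            node = node[part]
    return sorted(node)


# Group names assembled from the levels shown in the example above.
GROUPS = [
    "comp.infosystems.gopher",
    "comp.infosystems.kiosks",
    "comp.infosystems.wais",
    "comp.infosystems.www.misc",
    "comp.infosystems.www.providers",
    "comp.infosystems.www.users",
]

tree = build_hierarchy(GROUPS)
print(children(tree, "comp.infosystems"))
print(children(tree, "comp.infosystems.www"))
```

Browsing then means walking down the tree one level at a time, from a broad area such as comp to a narrow group such as comp.infosystems.www.users.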

Example

Computer-network interest groups: online searchable directories

About both e-mail groups and Usenet newsgroups

• http://paml.alastra.com/

• http://tile.net/

• http://www.liszt.com/

• http://www.mailbase.ac.uk/kovacs/

• http://www.meta-list.net/

Future trends in computer-network interest groups: usage

• Increasing number of interest groups

• Increasing number of readers / users

Online access information: future trends


• An increasing amount of information becomes available online.

• A growing amount of this online information becomes available free of charge.

• The quality of server and client software is improving.

A consequence is:

• An increasing number of end-users searching for information online.

Online access information: conclusion


• In the case of simple information needs, the WWW and the search tools can work like “magic”.

• However, in the case of more complicated information needs, there is still no “magic button” that brings you immediately to all the required information.


Thank you

Any questions?

The slides are available through the WWW from http://www.vub.ac.be/BIBLIO/nieuwenhuysen/presentations/