ESSAY

Online Zoological Collections of Australian Museums (OZCAM): a national approach to making zoological data available on the web

Elycia J. WALLIS

Collection Information Systems, Museum Victoria, Melbourne, Victoria, Australia

Abstract

Museums increasingly seek to publish data on the internet. However, individual records from a natural sciences database are not meaningful. The combined results of a number of datasets can be made much more meaningful by using a graphical display. In this paper, a project undertaken by a group of Australian natural sciences collecting institutions is described. Issues about data quality and other aspects of data management are discussed.

Key words: collection, database, graphical.

Correspondence: Elycia Wallis, Collection Information Systems, Museum Victoria, PO Box 666, Melbourne, Vic. 3001, Australia.

Email: [email protected]

INTRODUCTION

Natural sciences collections in museums reflect the past and present biodiversity of the world and the structure of the Earth itself. In museums, they usually consist of zoological, paleontological and geological collections. Natural sciences catalogs are usually highly structured with specific fields to record specific data. They are also characterized by having few fields that contain interpreted data (information). Natural sciences collections also have many specimens of each species, each collected in a different place at a different time. As a result, each individual specimen record carries little special meaning. The user can discover the identification of the specimen (always recorded as a scientific name, or Latin binomial), the details of how that particular specimen was collected, and a history of transactions, such as loans, that have involved that specimen. However, the catalog would not usually store information about the species that a single specimen represents. Information such as the distribution of the species, how common the species is or whether the habitat of the species has been affected recently by bushfires, for example, would not be stored in the object catalog. In short, object catalogs for natural sciences collections would not make compelling reading online without an interpretive overlay.

Technology has advanced so that it is possible to provide users with a retrieval mechanism that is sophisticated enough to take the data in natural sciences collections and add meaning through automation. This provides the user with greater autonomy to phrase their own questions and make their own interpretations (Cameron 2001). Embedded in this idea is one about how different data have different meanings for different people, a point Marty (1999) makes. Thematic or guided online presentations of collection information presuppose what the user might want to know. If the user is not interested in the material presented they will simply leave the site. However, interfaces that permit users to frame their own questions, and interpret the answers provided using their own frames of reference, are likely to encourage the user to stay longer and “play”.

The internet is a collaborative medium (Zorich 1997) and institutions can take advantage of this to provide users with a comprehensive and satisfying experience. A case study of one such project, OZCAM, follows.

© 2006 ISZS, Blackwell Publishing and IOZ/CAS

Integrative Zoology 2006; 2: 78-79 doi: 10.1111/j.1749-4877.2006.00018.x

DISCUSSION

The Online Zoological Collections of Australian Museums (OZCAM) project is a collaboration between all the major and several minor natural sciences collecting institutions in Australia. The result can be viewed at www.ozcam.gov.au. At present the searchable part of the site is password protected, but public access will soon be available.

The aim of the project is to provide an integrated, online, searchable database of natural sciences specimens in Australian museums. A distributed dataset model is used. In this model, institutions host their own data, rather than sending it away to a central location. This has advantages in that it does not matter what collection management software is used as long as the data can be easily extracted. It also allows the data to be updated easily. However, some institutions have encountered disadvantages. Those museums that lack a fast internet connection have experienced problems with timing out, where an answer to a query cannot be returned within the specified time limit (18 s). For some institutions the interim solution was to host their data on a remote server, which has created difficulties with updating.

Most institutions contributing data to OZCAM have adopted a “published dataset” model by which they extract a subset of data from the collection management system, according to elements defined in an XML schema, and publish this subset of data onto a Web-accessible server. A wrapper, or small piece of code, is added to the web server and acts as an intermediary between the dataset and the portal software embedded on the central website. When a user enters a query on the OZCAM website, the portal software sends simultaneous queries to all participant institutions. The wrappers at each institution receive the query, extract data from the dataset and return it in XML format to the portal. The portal then displays the results to the user, either as a list of data (showing individual specimen records) or as dots on a map. The map can be refined further by color-coding the dots either by species or by institution, which shows where each institution holds specimens from.
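The query flow described above can be illustrated with a minimal sketch. It is not the OZCAM portal code: the wrapper functions, the XML element names and the `portal_query` helper are all hypothetical, and the wrappers are simulated locally rather than reached over the network. Only the 18 s time limit comes from the text.

```python
import concurrent.futures as cf
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TIMEOUT_SECONDS = 18  # time limit quoted in the text


# Hypothetical stand-ins for institutional wrappers. Each takes a scientific
# name and returns an XML string; real wrappers would sit on a web server.
def museum_a_wrapper(name):
    return (
        "<records><record>"
        "<institution>A</institution>"
        f"<scientificName>{escape(name)}</scientificName>"
        "<latitude>-37.8</latitude><longitude>144.9</longitude>"
        "</record></records>"
    )


def museum_b_wrapper(name):
    return "<records/>"  # this institution holds no matching specimens


def parse_records(xml_text):
    """Turn one wrapper response into a list of dicts, one per specimen."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in rec} for rec in root.iter("record")]


def portal_query(name, wrappers, timeout=TIMEOUT_SECONDS):
    """Query every wrapper concurrently; report any that fail or time out."""
    results, failed = [], []
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(wrappers)))
    futures = {pool.submit(fn, name): label for label, fn in wrappers.items()}
    done, not_done = cf.wait(futures, timeout=timeout)
    for future in done:
        try:
            results.extend(parse_records(future.result()))
        except Exception:
            failed.append(futures[future])  # wrapper error or malformed XML
    failed.extend(futures[future] for future in not_done)  # timed out
    pool.shutdown(wait=False)  # do not block the user on slow institutions
    return results, sorted(failed)
```

Calling `portal_query("Vombatus ursinus", {"A": museum_a_wrapper, "B": museum_b_wrapper})` returns the combined specimen records together with a list of institutions that did not answer in time, which is one way a portal could show partial results rather than failing the whole search.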

There are a number of issues that can be raised about a system such as OZCAM. A principal one is data quality. Unless data undergo some sort of quality control process the results can look untidy. However, data cleaning projects are time-consuming and labor-intensive, so they are often not undertaken. The quality of the data published also reflects what is held in the source database. It is true that the “delivery systems now available on the web have outstripped the content currently held in collection databases” (Cameron 2001). This comment is applicable to both the quality of data and the quantity. Collection management and curatorial staff struggle to find the resources to do the routine work of registering new specimens into the collection management system. Also, in some groups of animals there is a problem of identification. While the fauna is well known for groups such as mammals and birds, in groups such as marine invertebrates there are many species yet to be formally named and described. For these groups, scientists often resort to using codes or letters to identify the species, which does not provide a good means for resource discovery. Another issue is one of setting a minimum dataset based on a standard. The OZCAM dataset is based on a standard called Darwin Core, an increasingly well-recognized standard. One principal issue with standards is that there are a number to choose from, each with a slightly different emphasis. The decision must simply be based on expediency; that is, choose the standard that best suits your purpose.
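As an illustration of the quality and minimum-dataset issues raised above, the sketch below checks a single record before publication. The field names are Darwin Core terms, but which terms OZCAM actually requires is not stated here, so `REQUIRED_TERMS` is an assumption, as are the pattern used to spot provisional names and the sample values.

```python
import re

# Assumed minimum dataset, named with Darwin Core terms. The terms OZCAM
# actually requires are not given in the text.
REQUIRED_TERMS = ("institutionCode", "catalogNumber", "scientificName")

# Assumed convention for coded names of undescribed species, e.g. "Alpheus sp. A".
PROVISIONAL_NAME = re.compile(r"\bsp\.\s*\w+$")


def check_record(record):
    """Return a list of problems found in one specimen record (a dict)."""
    problems = []
    for term in REQUIRED_TERMS:
        if not str(record.get(term) or "").strip():
            problems.append(f"missing {term}")

    name = str(record.get("scientificName") or "").strip()
    if name and PROVISIONAL_NAME.search(name):
        problems.append("provisional name")

    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if lat is not None and lon is not None:
        try:
            in_range = -90 <= float(lat) <= 90 and -180 <= float(lon) <= 180
        except (TypeError, ValueError):
            problems.append("coordinates not numeric")
        else:
            if not in_range:
                problems.append("coordinates out of range")
    return problems
```

A check of this kind does not replace data cleaning, but it lets an institution see how many records would plot badly on the map, or be hard to find by name, before the dataset is published.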

A final issue is the way data are accessed. Currently the only way into the data is through a search interface. Thus, the user must know the scientific name of a species in order to launch a search. Although this is not a problem for specialist users, it is problematic for non-specialists, who may not know how the data are modeled or appropriate search terminologies to use (Cameron 2001). This has already been flagged as an issue and some options will be introduced. These include providing picklists for some fields, common name searching, and ways for users to browse to find the search terms they want rather than being required to type the terms in.
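The picklist and common name options mentioned above could take many forms. The sketch below shows one simple possibility, assuming a small lookup table of common names; the table, the function names and the matching rule are illustrative and are not drawn from the OZCAM site.

```python
# Illustrative lookup table. A real system would draw on a maintained list
# of common names rather than a hard-coded dictionary.
COMMON_NAMES = {
    "common wombat": "Vombatus ursinus",
    "koala": "Phascolarctos cinereus",
    "platypus": "Ornithorhynchus anatinus",
}


def picklist(typed, names=COMMON_NAMES):
    """Return the common names that contain the typed text, sorted."""
    needle = typed.strip().lower()
    return sorted(name for name in names if needle in name)


def to_scientific_name(term, names=COMMON_NAMES):
    """Translate a common name; pass any other search term through unchanged."""
    cleaned = term.strip()
    return names.get(cleaned.lower(), cleaned)
```

With this in place, a non-specialist who types "wom" is offered "common wombat", and the search is then launched with the scientific name that the underlying datasets expect.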

In conclusion, museums are increasingly publishing data on the internet and wish to do so in a meaningful way. An example of one project in Australia is described and some of the issues raised are discussed.

REFERENCES

Cameron F (2001). World of museums. Wired collections: the next generation. Museum Management and Curatorship 19, 309–15.

Marty PF (1999). Museum informatics and information infrastructures: supporting collaboration across intra-museum boundaries. Archives and Museum Informatics 13, 169–79.

Zorich DM (1997). Beyond bitslag: integrating museum resources on the internet. In: Jones-Garmil K, ed. The Wired Museum. American Association of Museums, Washington, DC, pp. 171–202.
