Brief Notes from Kew Mark Jackson Software Applications Manager

Upload: pedro-howick

Post on 01-Apr-2015

TRANSCRIPT

Page 1: Brief Notes from Kew Mark Jackson Software Applications Manager

Brief Notes from Kew

Mark Jackson, Software Applications Manager

Page 2: Brief Notes from Kew Mark Jackson Software Applications Manager

Focussing on...

Herbarium digitisation

electronic Plant Information Centre

Page 3: Brief Notes from Kew Mark Jackson Software Applications Manager

Kew Herbarium

Guesstimated
– 7 million specimens
– 250,000 types

Less than 5% of specimens databased

A variety of personal databases

Page 4: Brief Notes from Kew Mark Jackson Software Applications Manager

Preparation for Digitisation

Computerise transactions

Agree and document policy and procedures

Establish core fields (HISPID pending ABCD)

Develop hardware and software infrastructure (e.g. catalogue database, mass storage)

Page 5: Brief Notes from Kew Mark Jackson Software Applications Manager

Digitisation Strategy

Curators to barcode, database and image types for loan

Repatriation & research projects
– to use infrastructure and core fields
– data to be imported into Catalogue (eventually)

Pursue digitisation projects

www.kew.org/data/repatbr

Page 6: Brief Notes from Kew Mark Jackson Software Applications Manager

Specimen imaging

Decision to try to match Cibachrome prints in terms of quality (e.g. suitable for many diagnostic purposes)
– 600 dpi delivers 200MB images

Stored as uncompressed (but bzipped) TIFFs

Acquisition of mass storage
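
The 200MB figure can be sanity-checked with simple arithmetic. The sketch below assumes an A3 scan area (297 x 420 mm, the size of the HerbScan flatbed) and 24-bit RGB colour; neither the colour depth nor the exact scan area is stated in the slides, so treat the result as a rough check rather than the actual specification. The class and method names are made up for illustration.

```java
// Rough size check for a 600 dpi master scan.
// Assumptions (not in the slides): A3 scan area of 297 x 420 mm, 24-bit RGB.
final class ScanSizeEstimate {
    static final double MM_PER_INCH = 25.4;

    /** Pixels along one edge for a length in millimetres at a resolution in dpi. */
    static long pixels(double lengthMm, int dpi) {
        return Math.round(lengthMm / MM_PER_INCH * dpi);
    }

    /** Uncompressed image size in bytes. */
    static long uncompressedBytes(double widthMm, double heightMm, int dpi, int bytesPerPixel) {
        return pixels(widthMm, dpi) * pixels(heightMm, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        long bytes = uncompressedBytes(297, 420, 600, 3);
        System.out.printf("A3 at 600 dpi: %d x %d px, about %.0f MB uncompressed%n",
                pixels(297, 600), pixels(420, 600), bytes / 1e6);
        // 250,000 is the guesstimated number of types in the Kew Herbarium.
        System.out.printf("250,000 type images: about %.0f TB before compression%n",
                250_000 * (bytes / 1e12));
    }
}
```

Under these assumptions a single scan comes to roughly 209 million bytes, which is consistent with the 200MB quoted, and it shows why mass storage had to be acquired before imaging at scale.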

Page 7: Brief Notes from Kew Mark Jackson Software Applications Manager

HerbScan

A3 flatbed scanner, inverted

Cradle for specimens

Distributed throughout Herbarium

Page 8: Brief Notes from Kew Mark Jackson Software Applications Manager

Pros and cons

Camera                              HerbScan
£30-40,000                          £7,500
200MB images barely achievable      200MB images easily achievable
1 image per minute                  10 images per hour
Fixed                               Some mobility
Versatile                           Suited to flat items

200 MB master images (600 dpi scans), based on capturing the level of detail of Cibachromes.
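
To put the two capture rates side by side, here is a back-of-envelope calculation. It assumes the rates are sustained without breaks and that the workload is the guesstimated 250,000 types; the station counts are hypothetical, chosen only to show how distributing several scanners changes the picture.

```java
// Back-of-envelope comparison of the two capture rates quoted above.
// Assumptions (not from the slides): rates are sustained without breaks,
// and the workload is the guesstimated 250,000 type specimens.
final class ThroughputEstimate {
    /** Hours needed to image `items` specimens on `stations` devices working in parallel. */
    static double hoursNeeded(long items, double imagesPerHour, int stations) {
        return items / (imagesPerHour * stations);
    }

    public static void main(String[] args) {
        long types = 250_000;
        System.out.printf("Camera, 1 station at 60/hour: %.0f hours%n",
                hoursNeeded(types, 60, 1));
        System.out.printf("HerbScan, 1 station at 10/hour: %.0f hours%n",
                hoursNeeded(types, 10, 1));
        // Hypothetical station count, for illustration only.
        System.out.printf("HerbScan, 5 stations at 10/hour: %.0f hours%n",
                hoursNeeded(types, 10, 5));
    }
}
```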

Page 9: Brief Notes from Kew Mark Jackson Software Applications Manager

HerbCat

Diagram labels: Client; Image Server; Images; Metadata; image enquiries; HerbCat enquiries

Page 10: Brief Notes from Kew Mark Jackson Software Applications Manager

Focussing on...

Herbarium digitisation

electronic Plant Information Centre

Page 11: Brief Notes from Kew Mark Jackson Software Applications Manager

UK government funding for delivery of services electronically

Resource-discovery interface to multiple Kew data sources (not necessarily at Kew)

Data sources are heterogeneous

Simple interface overlaying other systems

Diagram: ePIC Interface sitting above four boxes, each labelled "Data source"

Page 15: Brief Notes from Kew Mark Jackson Software Applications Manager

Architecture

Interface (Java servlet/JSPs)
– Requests
– Results

Multi-threaded Java server
– Request queue
– Handlers:
  • one per data source
  • one for logging
  • one for spell-checking
– Configuration files (XML)

Data sources
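
The request-queue-plus-handlers arrangement can be sketched in a few lines of Java. This is a minimal illustration only, not the ePIC code: the `Handler` interface, `Request`, `QueryServer`, and `Demo` names are all hypothetical, the XML configuration step is omitted, and logging or spell-checking handlers would simply be further `Handler` implementations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

/** Hypothetical handler contract: one implementation per data source (or for logging, etc.). */
interface Handler {
    String name();
    List<String> handle(String query) throws Exception;
}

/** A queued request; merged results are delivered through the future. */
final class Request {
    final String query;
    final CompletableFuture<List<String>> results = new CompletableFuture<>();
    Request(String query) { this.query = query; }
}

final class QueryServer implements AutoCloseable {
    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    private final List<Handler> handlers;
    private final ExecutorService workers;
    private final Thread dispatcher;

    QueryServer(List<Handler> handlers) {
        this.handlers = List.copyOf(handlers);
        this.workers = Executors.newFixedThreadPool(Math.max(1, handlers.size()));
        this.dispatcher = new Thread(this::dispatchLoop, "dispatcher");
        this.dispatcher.setDaemon(true);
        this.dispatcher.start();
    }

    /** Called by the web tier: enqueue the query and return a future for the results. */
    CompletableFuture<List<String>> submit(String query) {
        Request r = new Request(query);
        queue.add(r);
        return r.results;
    }

    private void dispatchLoop() {
        try {
            while (true) {
                Request r = queue.take();
                // Fan the query out to every handler in parallel.
                List<Future<List<String>>> pending = new ArrayList<>();
                for (Handler h : handlers) {
                    pending.add(workers.submit(() -> h.handle(r.query)));
                }
                List<String> merged = new ArrayList<>();
                for (int i = 0; i < pending.size(); i++) {
                    String source = handlers.get(i).name();
                    try {
                        for (String hit : pending.get(i).get()) {
                            merged.add(source + ": " + hit);
                        }
                    } catch (ExecutionException e) {
                        // One failing source should not sink the whole request.
                        merged.add(source + ": unavailable");
                    }
                }
                r.results.complete(merged);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shutting down
        }
    }

    @Override public void close() {
        dispatcher.interrupt();
        workers.shutdownNow();
    }
}

final class Demo {
    /** Stand-in data source that returns fixed hits regardless of the query. */
    static Handler source(String name, String... hits) {
        return new Handler() {
            public String name() { return name; }
            public List<String> handle(String query) { return List.of(hits); }
        };
    }

    public static void main(String[] args) throws Exception {
        try (QueryServer server = new QueryServer(
                List.of(source("sourceA", "record 1"), source("sourceB")))) {
            System.out.println(server.submit("Acacia").get());
        }
    }
}
```

Keeping each data source behind the same small interface is what lets heterogeneous sources sit under one simple front end; a slow or failing source degrades to an "unavailable" entry instead of failing the whole request.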

Page 16: Brief Notes from Kew Mark Jackson Software Applications Manager

Texts

Web documents indexed using Lucene

Flora Zambesiaca digitised and marked-up with XML

Experimentation with options for query and output via Java servlet
– using XSL to output selections
– using Lucene to index the XML
– importing the XML into a database

Other texts - jury still out, but Lucene route looks promising
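
As an illustration of the "XSL to output selections" option, the sketch below applies a small stylesheet to an XML fragment using the JDK's built-in `javax.xml.transform` API. The markup is invented for the example (the element names `flora`, `taxon`, `sciname`, and `description` are assumptions, not the Flora Zambesiaca schema), and the descriptions are placeholder text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XslSelection {
    // Hypothetical markup: element names and text are invented for this sketch.
    static final String FLORA_XML =
        "<flora>"
        + "<taxon><sciname>Acacia</sciname><description>Placeholder text A.</description></taxon>"
        + "<taxon><sciname>Albizia</sciname><description>Placeholder text B.</description></taxon>"
        + "</flora>";

    // Stylesheet that emits, as plain text, only the taxon whose name matches $wanted.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='wanted'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//taxon[sciname=$wanted]'>"
        + "<xsl:value-of select='sciname'/>: <xsl:value-of select='description'/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the selected taxon as text, or an empty string if nothing matches. */
    static String select(String xml, String wanted) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        t.setParameter("wanted", wanted);
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(select(FLORA_XML, "Acacia"));
    }
}
```

In a servlet the same `Transformer` call would write straight to the response; the trade-off against the Lucene route is that XSL selects by document structure, while an index is better suited to free-text search.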

Page 17: Brief Notes from Kew Mark Jackson Software Applications Manager

Feedback

Email mechanisms

Web usability testing/focus groups

Logging
– Quantitative success
  • levels of usage, patterns & trends
  • beware: crawlers, testing & development staff, harvesters
  • referring URLs, Google link: popularity of site
  • country, domain
– Qualitative success
  • success of queries esp. zero hits (spelling, common names, families)
  • performance & system monitoring
  • number of queries per session, return visits
  • results pages viewed
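
Two of the logging points above, excluding automated traffic and watching for zero-hit queries, can be sketched as follows. The log record shape and the list of user-agent markers are assumptions made for this example; the slides do not describe the real log format.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class QueryLogStats {
    /** Hypothetical log record; the real log format is not given in the slides. */
    static final class Entry {
        final String userAgent;
        final String query;
        final int hits;
        Entry(String userAgent, String query, int hits) {
            this.userAgent = userAgent;
            this.query = query;
            this.hits = hits;
        }
    }

    // Crude markers for automated traffic; an assumption for this sketch.
    static final List<String> BOT_MARKERS = List.of("bot", "crawler", "spider", "harvest");

    static boolean isAutomated(Entry e) {
        String ua = e.userAgent.toLowerCase(Locale.ROOT);
        return BOT_MARKERS.stream().anyMatch(ua::contains);
    }

    /** Zero-hit queries from non-automated traffic, with how often each occurred. */
    static Map<String, Integer> zeroHitQueries(List<Entry> log) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Entry e : log) {
            if (!isAutomated(e) && e.hits == 0) {
                counts.merge(e.query.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A recurring zero-hit term (a misspelling or a common name, say) is a candidate for the spell-checking handler or a synonym list. Traffic from testing and development staff would need a separate filter, for example by address, which is not shown here.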

Page 18: Brief Notes from Kew Mark Jackson Software Applications Manager

World distribution of queries

Page 19: Brief Notes from Kew Mark Jackson Software Applications Manager

Future

More data sources, including texts and images

Hierarchical browsing front-end based around revamped Brummitt Families & Genera with phylogenetic classification

Looking forward to
– using the GBIF Names Service…
– links with DiGIR/BioCASE resources...

www.kew.org/epic