No Specimen Left Behind: industrial scale digitisation of natural history collections



DESCRIPTION

Blagoderov V. and Smith V.S. 2011. No Specimen Left Behind: industrial scale digitisation of natural history collections. Life and Literature, Biodiversity Heritage Library conference, Chicago, Illinois, USA, 14 – 15 November, 2011.

TRANSCRIPT

No Specimen Left Behind: Industrial scale digitisation of natural history collections

Vladimir Blagoderov & Vincent Smith  •  Natural History Museum London

The Problems With Digitisation

“the rate of progress by the UK taxonomic institutions in digitising and making collections information available is disappointingly low. Unless a more strategic view is taken... there is a significant risk of damage to the international reputation of major institutions”

• At present rates it would take approximately 900 years to get the data off the Natural History Museum’s 70 million specimens, and 500 years to take the pictures.

• New technologies have the potential to speed this to completion within twenty years.

• These advances mean that selecting specimens for digitisation now takes longer than the act of digitising them.

• Metadata capture is the rate-limiting step, but using a staged approach even this can be overcome.

• This offers the potential to build a comprehensive digital museum, with new ways to use and organise the collection.
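The figures in the bullets above imply annual processing rates that can be checked with simple arithmetic. The sketch below uses only the numbers quoted on the poster (70 million specimens, 900 and 500 years at present rates, 20 years as the target); the variable names are illustrative, and the assumption of a constant rate over the whole period is ours.

```python
# Figures quoted on the poster.
SPECIMENS = 70_000_000
YEARS_DATA_NOW = 900     # years to capture the data at present rates
YEARS_IMAGES_NOW = 500   # years to take the pictures at present rates
YEARS_TARGET = 20        # completion time suggested for new technologies


def per_year(total: int, years: int) -> float:
    """Average number of specimens handled per year (assumes a constant rate)."""
    return total / years


data_rate_now = per_year(SPECIMENS, YEARS_DATA_NOW)
image_rate_now = per_year(SPECIMENS, YEARS_IMAGES_NOW)
target_rate = per_year(SPECIMENS, YEARS_TARGET)

# Required speed-up factors relative to present rates.
data_speedup = YEARS_DATA_NOW / YEARS_TARGET
image_speedup = YEARS_IMAGES_NOW / YEARS_TARGET

print(f"Data capture now:  {data_rate_now:,.0f} specimens/year")
print(f"Imaging now:       {image_rate_now:,.0f} specimens/year")
print(f"Target (20 years): {target_rate:,.0f} specimens/year")
print(f"Speed-up needed:   {data_speedup:.0f}x (data), {image_speedup:.0f}x (images)")
```

On these figures, finishing in twenty years would mean handling about 3.5 million specimens a year, roughly 45 times the present rate of data capture and 25 times the present rate of imaging.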

In collections management

• Accurate specimen counts for the entire collection

• Collections audit and security

• Saving curator & visitor time

• Improving curation

• Updating identifications online

• Encouraging typification (discovery of unrecognized types)

• Populating specimen databases

In research

• Acquiring images for use with automated identification software

• Supporting manual taxonomic identifications

• Morphometric analysis of specimens

• Supporting the monitoring of environmental change

• Supporting biodiversity conservation research

• Studies on colour pattern variations

In public engagement

• Visual & engaging equipment on public display

• Innovative crowd sourcing possibilities with the public

• Meeting the museum's strategic commitments on collection accessibility

The Current Situation

The Potential

2009 UK House of Lords Science and Technology Committee Report on Taxonomy and Systematics

The daunting task of digitising natural history collections (photo: L. Livermore)

Our Solution: SatScan

SatScan Key Facts

• Objects >10mm usefully digitised at standard resolution

• i.e. 85k of the 135k collection drawers in the NHM Entomology collection

• The process would take 8 years for 1 person and 1 machine

• Higher resolution options available at lower throughput rates

• 1k, 2k & 4k dpi options, with corresponding depth of field from 3 cm down to 5 mm and file sizes from 300 MB to 4.8 GB

• Software setup wizard enables use by untrained volunteers

• Metadata Creator Tool for rapid metadata collection

• Images cropped back to individual specimens

• Physical identifiers (barcodes) permanently link the specimen, image & metadata together
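The quoted file sizes are consistent with image size growing with the square of the linear resolution. The sketch below is a rough check only: it assumes that the 300 MB figure corresponds to the 1k dpi option and the 4.8 GB figure to the 4k dpi option, that 1 GB is 1000 MB, and that the scanned area and compression stay the same. The value it gives for 2k dpi is an estimate, not a figure from the poster.

```python
BASE_DPI = 1000      # assumed to correspond to the smallest quoted file size
BASE_SIZE_MB = 300   # smallest file size quoted on the poster


def estimated_size_mb(dpi: int) -> float:
    """Estimate file size, assuming pixel count scales with dpi squared."""
    scale = dpi / BASE_DPI
    return BASE_SIZE_MB * scale ** 2


for dpi in (1000, 2000, 4000):
    print(f"{dpi} dpi: ~{estimated_size_mb(dpi) / 1000:.1f} GB")
```

Quadrupling the resolution multiplies the pixel count by 16, which takes 300 MB to 4.8 GB and matches the upper figure in the list.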

The effort of individually handling the NHM’s 70 million specimens would be enormous, but most specimens are grouped in a way that makes them much easier to handle. For example, the entomology department has 28 million specimens in just 135,000 drawers.

Working with SmartDrive, we have developed a machine (SatScan) that can produce an ultra-high resolution digital image of a drawer in 5 minutes. From this image we can examine the specimens in detail.
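The drawer-level figures can be combined into a rough throughput estimate. This sketch uses only numbers quoted on the poster (28 million specimens, 135,000 drawers, 85,000 drawers suitable for standard resolution, 5 minutes per drawer, 8 years for one person and one machine). The poster does not explain how the 8-year estimate was derived, so the hours-per-year figure below is simply what those numbers imply together, and the average specimens-per-drawer value assumes an even spread across drawers.

```python
# Figures quoted on the poster.
SPECIMENS = 28_000_000     # entomology specimens
DRAWERS_TOTAL = 135_000    # entomology drawers
DRAWERS_SUITABLE = 85_000  # drawers usefully digitised at standard resolution
MINUTES_PER_DRAWER = 5     # imaging time per drawer
YEARS_QUOTED = 8           # quoted duration for 1 person and 1 machine

# Average drawer contents (assumes specimens are spread evenly).
specimens_per_drawer = SPECIMENS / DRAWERS_TOTAL

# Imaging time attributable to each specimen when a whole drawer is scanned.
seconds_per_specimen = MINUTES_PER_DRAWER * 60 / specimens_per_drawer

# Pure scanning time for all suitable drawers, and the yearly load implied
# by the quoted 8-year duration.
scan_hours = DRAWERS_SUITABLE * MINUTES_PER_DRAWER / 60
hours_per_year = scan_hours / YEARS_QUOTED

print(f"Specimens per drawer (average): {specimens_per_drawer:.0f}")
print(f"Imaging time per specimen:      {seconds_per_specimen:.2f} s")
print(f"Total scanning time:            {scan_hours:,.0f} h")
print(f"Scanning hours per year:        {hours_per_year:.0f} h")
```

Imaging by drawer rather than by specimen works out at about one and a half seconds of scanning per specimen on average.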

The machine is a combination of hardware and software that provides automated capture of lower resolution images, which are then assembled (stitched) into a larger panoramic image, generating an extremely high resolution final image.

A camera with an attached telecentric lens is moved in two dimensions along precision rails positioned above the imaged object. This method maximises the depth of field of the captured images and minimises distortion and parallax artifacts.
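The two-dimensional camera movement described above can be pictured as a grid of overlapping tiles that are later stitched together. The sketch below is a generic illustration of that idea, not SatScan's actual control software: the function name, the drawer and tile dimensions, and the overlap value are all hypothetical, since the poster gives none of them.

```python
from typing import List, Tuple


def grid_positions(field_w: int, field_h: int,
                   tile_w: int, tile_h: int,
                   overlap: int) -> List[Tuple[int, int]]:
    """Return top-left (x, y) positions, in mm, of overlapping tiles
    that together cover a field of field_w x field_h mm."""
    if not (0 <= overlap < min(tile_w, tile_h)):
        raise ValueError("overlap must be non-negative and smaller than the tile")
    step_x = tile_w - overlap
    step_y = tile_h - overlap
    # Ceiling division gives the number of extra steps after the first tile.
    cols = max(1, -(-(field_w - tile_w) // step_x) + 1)
    rows = max(1, -(-(field_h - tile_h) // step_y) + 1)
    positions = []
    for r in range(rows):
        # Clamp the last row/column so tiles never extend past the field edge.
        y = max(0, min(r * step_y, field_h - tile_h))
        for c in range(cols):
            x = max(0, min(c * step_x, field_w - tile_w))
            positions.append((x, y))
    return positions


# Hypothetical values: a 500 x 400 mm drawer, 100 x 80 mm tiles, 20 mm overlap.
positions = grid_positions(500, 400, 100, 80, 20)
print(f"{len(positions)} tiles; first at {positions[0]}, last at {positions[-1]}")
```

The overlap between neighbouring tiles is what gives stitching software shared features to align on; the amount needed in practice depends on the stitching method and is not stated on the poster.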

The Workflow

Next Steps

• The SatScan system is primarily used by collections management, but on an ad hoc basis

• We need to develop a more comprehensive program of digitisation for an exemplar collection

• Target groups include the synoptic British collections (insects & plants), Lepidoptera and lichens

• We are working on a web interface to crowd source image cropping and metadata collection

• The NHM Digital Asset Management System and our Collections Management System are being integrated into this process

• We plan to assign DataCite DOIs to each specimen, its metadata record and its digital image
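One way to picture the link between a barcoded specimen, its cropped image, its metadata and the three planned identifiers is a single record keyed by barcode. The sketch below is purely illustrative: the class, its field names and the example values are hypothetical, and it does not reflect the schema of the NHM systems or the DataCite metadata format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SpecimenRecord:
    """Hypothetical record tying a specimen to its image and metadata."""
    barcode: str                          # physical identifier on the specimen
    drawer_image: str                     # file name of the whole-drawer scan
    crop_box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
    metadata: Dict[str, str] = field(default_factory=dict)
    specimen_doi: Optional[str] = None    # identifiers are empty until assigned
    metadata_doi: Optional[str] = None
    image_doi: Optional[str] = None

    def missing_dois(self) -> List[str]:
        """List which of the three planned identifiers are still unassigned."""
        dois = {
            "specimen": self.specimen_doi,
            "metadata": self.metadata_doi,
            "image": self.image_doi,
        }
        return [name for name, value in dois.items() if value is None]


# Placeholder values only; not real barcodes or file names.
record = SpecimenRecord(
    barcode="EXAMPLE-0001",
    drawer_image="drawer_0001.tif",
    crop_box=(0, 0, 100, 100),
)
catalogue = {record.barcode: record}  # the barcode serves as the lookup key
print(catalogue["EXAMPLE-0001"].missing_dois())
```

Keeping the barcode as the key means the cropped image and the metadata can be captured at different stages and still resolve to the same specimen.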

http://vbrant.eu/

http://www.smartdrive.co.uk/

Read more: http://hdl.handle.net/10101/npre.2010.4486.1

Acknowledgements

SmartDrive: D. Murphy

NHM: I. Kitching, T. Simonsen

Poster Design: M. Nikunlassi, L. Livermore