scaling-up collections digitisation

Scaling-up collections digitisation: Vincent S. Smith, Vladimir Blagoderov, Ian Kitching & Thomas Simonsen

DESCRIPTION

A Science Information Committee (SIC) presentation authored by Smith, V.S., Blagoderov, V., Kitching, I. and Simonsen, T., given at the Natural History Museum, London, UK, on May 14th, 2010.

TRANSCRIPT

Page 1: Scaling-up collections digitisation

Scaling-up collections digitisation

Vincent S. Smith, Vladimir Blagoderov, Ian Kitching & Thomas Simonsen

Page 2: Scaling-up collections digitisation

“the rate of progress by the UK taxonomic institutions in digitising and making collections information available is disappointingly low… there is a significant risk of damage to the international reputation of major institutions such as The Natural History Museum”

House of Lords Science and Technology Committee, Report on Taxonomy and Systematics, 2009

Page 3: Scaling-up collections digitisation

Rate of digitisation at the NHM

Page 4: Scaling-up collections digitisation

Specimen focus

Page 5: Scaling-up collections digitisation

SatScan™ (by SmartDrive)

Page 6: Scaling-up collections digitisation
Page 7: Scaling-up collections digitisation

Example outputs

Diptera: http://sciaroidea.info/node/44309

Coreidae: http://sciaroidea.info/node/44310

Page 8: Scaling-up collections digitisation

Sackler Lab Trials

Nine test projects over 1 month (ent., bot. & palaeoent.)
- Assess utility for coll. management and research
- Understand technical & practical limitations

Key Facts
• Minimal resolved structures: 0.06 - 0.1 mm
• Depth of field: 10 - 80 mm
• File size (15000 x 14000 px): 340 MB (TIFF)
• Scanning time (45 x 50 cm): 5-7 min, depending on exposure
• Stitching time, 200-400 tiles: 5:30-9:30 min (batchable, overnight)
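The Key Facts can be related to one another with a little arithmetic. The sketch below estimates the pixel pitch of a stitched image and how many pixels span the smallest resolved structures. It assumes that the 15000 px side corresponds to the 50 cm side of the scanned area and the 14000 px side to the 45 cm side; the slide does not state this mapping.

```python
# Back-of-envelope pixel pitch for one stitched drawer image.
# Figures come from the Key Facts; mapping 15000 px to the 50 cm side
# (and 14000 px to the 45 cm side) is an assumption.

def pixel_pitch_mm(length_mm: float, pixels: int) -> float:
    """Physical length covered by one pixel along one axis, in mm."""
    return length_mm / pixels

long_side = pixel_pitch_mm(500.0, 15000)   # 50 cm across 15000 px
short_side = pixel_pitch_mm(450.0, 14000)  # 45 cm across 14000 px

# Pixels spanning the finest and coarsest quoted resolved structures.
pixels_across_fine = 0.06 / long_side
pixels_across_coarse = 0.1 / long_side

print(f"pixel pitch: {long_side:.4f} mm x {short_side:.4f} mm")
print(f"0.06 mm spans about {pixels_across_fine:.1f} px")
print(f"0.1 mm spans about {pixels_across_coarse:.1f} px")
```

Under these assumptions one pixel covers roughly 0.03 mm, so the quoted 0.06 - 0.1 mm resolved structures correspond to about two to three pixels.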

Page 9: Scaling-up collections digitisation

Sackler Lab Trials: Aperture, Exposure, Depth of Field & Resolution

Aperture    Exposure (ms)    DoF (mm)    Smallest resolvable structure (µm)
Open        11               6           56
Midway      41               17          59
Closed      810              80          98

Page 10: Scaling-up collections digitisation

Entomology dept.

General points
• Best suited to drawers of numerous, uniformly positioned, medium-sized specimens
• Excellent results with large and medium-sized beetles, moths and butterflies
• Sufficient information is usually preserved to allow identification of these specimens
• Objects smaller than 10 mm could not be imaged as adequately
• Such images could be used in other ways
• Specimen labels and barcodes (when not obscured) could be easily read from the digitised image

Implications
• Of the 135,000 drawers in Entomology, 85,000 could be usefully imaged at the current level of resolution with this system
• This work could be completed in ~2024 person-days (ten person-years) using one system
• Other lens / camera options might be explored to image the remaining drawers
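The effort estimate above implies a daily throughput that can be checked directly. This is a minimal sketch: the drawer and person-day counts are taken from the slide, while the 7-hour working day (`HOURS_PER_DAY`) is an assumption that the source does not state.

```python
# Throughput implied by the slide's effort estimate.
DRAWERS = 85_000        # drawers judged usefully imageable (from the slide)
PERSON_DAYS = 2_024     # estimated effort (from the slide)
PERSON_YEARS = 10       # the slide's equivalent in person-years
HOURS_PER_DAY = 7.0     # assumed working day, not stated in the source

drawers_per_day = DRAWERS / PERSON_DAYS
minutes_per_drawer = HOURS_PER_DAY * 60 / drawers_per_day
working_days_per_year = PERSON_DAYS / PERSON_YEARS

print(f"{drawers_per_day:.1f} drawers per person-day")
print(f"{minutes_per_drawer:.1f} minutes per drawer")
print(f"{working_days_per_year:.1f} working days per person-year")
```

With the assumed 7-hour day, the estimate works out at about 42 drawers per day, or roughly 10 minutes per drawer, which leaves some handling time on top of the 5-7 minute scan.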

Page 11: Scaling-up collections digitisation

Caveats

NHM Issues
• Metadata
• Utility of surface (usually dorsal) view images - not a panacea
• Assigning specimen level identifiers (physical, virtual or both)
• Image storage (85k stitched images = 28,222 GB or 27.6 TB)
• Software workflow (managing identifiers, cropping etc.)
• Integration with existing systems (KE EMu and DAMS)
• Challenges to research & collection management processes (e.g. staff time, curation activities)
• Cost: circa £50k (outright purchase) or £2k per month hire

Hardware / Software issues
• Max. scanning area ~ 500 x 600 mm – insufficient for some drawers
• Occasional errors during scanning and stitching
• Focusing (currently time consuming)
• Inconvenient access to scanning area
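The storage and cost figures in the caveats above can be reproduced as follows. The sketch assumes binary unit conversion (1 GB = 1024 MB, 1 TB = 1024 GB), which matches the slide's totals; the purchase-versus-hire break-even is a simple ratio that ignores maintenance, staffing and any discounting.

```python
# Storage for the full run of stitched images, using the slide's figures.
IMAGES = 85_000       # stitched drawer images
MB_PER_IMAGE = 340    # TIFF size quoted in the Key Facts

total_gb = IMAGES * MB_PER_IMAGE / 1024   # binary units assumed
total_tb = total_gb / 1024

# Months of hire that would cost as much as buying the system outright.
PURCHASE_GBP = 50_000
HIRE_GBP_PER_MONTH = 2_000
break_even_months = PURCHASE_GBP / HIRE_GBP_PER_MONTH

print(f"storage: {total_gb:,.0f} GB ({total_tb:.1f} TB)")
print(f"hire cost matches purchase after {break_even_months:.0f} months")
```

On these numbers, hiring costs more than outright purchase once a project runs beyond about two years.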

Page 12: Scaling-up collections digitisation

Metadata capture is rate limiting

• Specimen images & metadata need not be captured together
• Link back together through common identifiers
• Specimen level identifiers can be physical, virtual or both
• Assignment of virtual identifiers might be automated
• Prioritise metadata capture on research & collection activities
• Image and re-image as required
• Crowdsource metadata capture, assignment of identifiers and image cropping
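The idea of capturing images and metadata separately and joining them later through a common identifier can be sketched as below. Everything here is hypothetical and for illustration only: the record types, field names, identifier format and sample values are assumptions, and none of them reflects the actual KE EMu or DAMS schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ImageCrop:
    specimen_id: str                 # shared identifier (hypothetical format)
    drawer_image: str                # file name of the stitched drawer scan
    box: Tuple[int, int, int, int]   # crop rectangle: left, top, right, bottom (px)


@dataclass
class SpecimenMetadata:
    specimen_id: str
    determination: str
    locality: Optional[str] = None


def link_by_identifier(
    crops: List[ImageCrop], metadata: List[SpecimenMetadata]
) -> Dict[str, Tuple[ImageCrop, Optional[SpecimenMetadata]]]:
    """Pair each image crop with its metadata record, or None if not yet captured."""
    by_id = {m.specimen_id: m for m in metadata}
    return {c.specimen_id: (c, by_id.get(c.specimen_id)) for c in crops}


# Made-up sample data: two specimens imaged, only one with metadata so far.
crops = [
    ImageCrop("SPEC-0001", "drawer_042.tif", (100, 200, 900, 1000)),
    ImageCrop("SPEC-0002", "drawer_042.tif", (1000, 200, 1800, 1000)),
]
metadata = [SpecimenMetadata("SPEC-0001", "example taxon", "example locality")]

linked = link_by_identifier(crops, metadata)
pending = [sid for sid, (_, meta) in linked.items() if meta is None]
print("awaiting metadata:", pending)
```

Keeping the two record types independent means imaging can proceed at scanner speed while metadata is filled in later, for example as research or curation activity touches a specimen.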

Page 13: Scaling-up collections digitisation
Page 14: Scaling-up collections digitisation

Possible Applications

Research
• Acquiring images for use with automated identification software
• Manual identifications
• Morphometric analysis of specimens
• Support the monitoring of environmental change
• Supporting biodiversity conservation research
• Studies on colour pattern variations

Collection management
• Accurate specimen counts for the entire collection
• Collections audit and security
• Improving accessibility to the entire collection
• Saving curator & visitor time
• Improving curation
• Updating identifications (crowdsourcing possibilities)
• Encouraging typification (discovery of unrecognised/unlabelled types)
• Populating KE EMu

Public engagement
• Visual & engaging equipment on display in Sackler Lab.
• Innovative crowdsourcing possibilities with the public
• Meets NHM strategic commitments on collection accessibility

Page 15: Scaling-up collections digitisation

Next Steps…

Larger Scale Project to address NHM Issues
• Metadata
• Utility of surface (usually dorsal) view images - not a panacea
• Assigning specimen level identifiers (physical, virtual or both)
• Image storage (85k stitched images = 28,222 GB or 27.6 TB)
• Software workflow (managing identifiers, cropping etc.)
• Integration with existing systems (KE EMu and DAMS)
• Challenges to research & collection management processes (e.g. staff time, curation activities)
• Cost: circa £50k (outright purchase) or £2k per month hire

Acknowledgements

• SmartDrive Ltd (esp. Mike Broderick & Dennis Murphy)

http://sciaroidea.info/sites/sciaroidea.info/files/SatScanTrialReport.pdf