price2 ecn2013

Rapid, industrial scale digitization of the NHM microscope slide collection Ben Price & Vladimir Blagoderov

Upload: ecnofficer

Post on 10-May-2015

Category:

Technology


TRANSCRIPT

Page 1: Price2 ecn2013

Rapid, industrial scale digitization of the NHM microscope slide collection

Ben Price & Vladimir Blagoderov

Page 2: Price2 ecn2013
Page 3: Price2 ecn2013

Outline

• The NHM slide collection

• What is Digitization?

• The NHM workflow

• Psyllid collection

• Future prospects

Page 4: Price2 ecn2013

The NHM slide collection

• ~ 2 million slides (60 : 40 vertical : horizontal storage)

Page 5: Price2 ecn2013

The NHM slide collection

• Mix of slide sizes, mounts, storage cabinets

Page 6: Price2 ecn2013

What is Digitization?

?

Page 7: Price2 ecn2013

What is Digitization?

Label data:

– Quick to image

• 5000 per day

– Slow to transcribe (crowdsourcing)

– Slow to georeference (crowdsourcing)

Page 8: Price2 ecn2013

What is Digitization?

Specimen:

– Slow to image

• 100,000 per year

– Data storage

• GB images

– Image delivery

• Proprietary software

– Do we need ALL specimens?

Page 9: Price2 ecn2013

The NHM workflow*

Preparation Handling Imaging Post Processing Data Capture

* Work in progress

Page 10: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Datamatrix Labels (4.5mm)

• Processing Scripts (GIMP, Barcodefiler)

• Computing Facilities (64bit, 16GB RAM)

• Storage & Retrieval (Ke-EMu)

– What is a slide?

• Delivery (NHM data portal)
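To make the role of the Datamatrix labels concrete, here is a minimal sketch of a barcode-driven file naming step. It is not the behaviour of Barcodefiler or of the NHM scripts, which the slides do not describe; the function name `plan_renames`, the `_2` suffix for repeated barcodes, and the example barcode text are all hypothetical. It assumes the barcodes have already been decoded by some other tool.

```python
from pathlib import PurePosixPath


def plan_renames(decoded):
    """Plan new file names from decoded barcode text.

    decoded: dict mapping image path -> barcode text, or None if the
             barcode could not be read (decoding itself is out of scope).
    Returns (renames, unreadable): a dict of old path -> new path, and a
    list of paths that need manual attention.
    """
    renames, unreadable, seen = {}, [], {}
    for path in sorted(decoded):
        code = decoded[path]
        if not code:
            unreadable.append(path)
            continue
        # Hypothetical convention: repeat scans of one slide get _2, _3, ...
        seen[code] = seen.get(code, 0) + 1
        stem = code if seen[code] == 1 else f"{code}_{seen[code]}"
        old = PurePosixPath(path)
        renames[path] = str(old.with_name(stem + old.suffix))
    return renames, unreadable


# Made-up example values, for illustration only.
renames, unreadable = plan_renames({
    "scan/a.tif": "EXAMPLE0001",
    "scan/b.tif": None,
    "scan/c.tif": "EXAMPLE0001",
})
```

Returning a plan rather than renaming files directly keeps unreadable labels visible for a manual pass.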

Page 11: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Horizontal vs Vertical storage

• Card Slide covers!

• Labelling & Handling = up to 90% of the time

Page 12: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Scanner – SLR – Mamiya Leaf – SatScanner

• Balance slides per image vs label resolution (PPI)

• Single slide imaging?

Page 13: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Horizontal Storage:

• Less handling

– Tray fits A3 scanner / SLR

• Can be autocropped
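One way a tray image could be autocropped is sketched below. This is an illustrative approach, not the NHM script: it assumes slides appear brighter than the tray background and sit in a regular grid, so that bright rows and bright columns can be found independently. The names `runs` and `find_slide_boxes` and the threshold of 128 are assumptions.

```python
def runs(flags, min_len=1):
    """Return (start, stop) pairs for each run of consecutive True values."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                out.append((start, i))
            start = None
    if start is not None and len(flags) - start >= min_len:
        out.append((start, len(flags)))
    return out


def find_slide_boxes(gray, threshold=128, min_len=1):
    """Find (left, top, right, bottom) boxes in a grayscale image.

    gray: list of rows, each a list of 0-255 pixel values.
    Assumes bright slides on a dark background, laid out in a grid.
    """
    bright = [[value > threshold for value in row] for row in gray]
    row_flags = [any(row) for row in bright]
    col_flags = [any(col) for col in zip(*bright)]
    return [
        (x0, y0, x1, y1)
        for (y0, y1) in runs(row_flags, min_len)
        for (x0, x1) in runs(col_flags, min_len)
    ]


# Tiny synthetic "tray": two bright slides side by side on a dark background.
tray = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 255, 255, 0, 255, 255, 0],
    [0, 255, 255, 0, 255, 255, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = find_slide_boxes(tray)
```

On real scans `min_len` would need raising to ignore dust and label edges, and trays with irregular slide positions would break the grid assumption.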

Page 14: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Horizontal Storage:

• Less handling

– Tray fits A3 scanner / SLR

• Manual cropping

– Crowd cropping?

Page 15: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Vertical storage:

• Single type of template (post processing)

• High contrast (scripts)

• Cheap (foam, card)

• More Handling

• Autocropping
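With a single template, autocropping can reduce to cutting every image along the same fixed grid. The sketch below shows that idea only; the function name `template_boxes`, the uniform margin, and the example dimensions are assumptions rather than details from the talk.

```python
def template_boxes(image_w, image_h, rows, cols, margin=0):
    """Return (left, top, right, bottom) crop boxes for a fixed grid.

    Assumes every image is framed identically, so one set of boxes can be
    reused for the whole batch.
    """
    cell_w = (image_w - 2 * margin) / cols
    cell_h = (image_h - 2 * margin) / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((
                round(margin + c * cell_w),
                round(margin + r * cell_h),
                round(margin + (c + 1) * cell_w),
                round(margin + (r + 1) * cell_h),
            ))
    return boxes


# Hypothetical 1000 x 600 pixel image holding 2 rows of 5 slides.
boxes = template_boxes(1000, 600, rows=2, cols=5)
```

Each tuple follows the left, upper, right, lower order that Pillow's `Image.crop` accepts, so the boxes could be passed to it directly.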

Page 16: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Resolution tests (PPI)

– Canon 650D (18MP sensor) + 50mm Macro

PPI      250   300   450   600
Slides    72    45    18    10
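The trade-off between label resolution and slides per image follows from simple geometry, sketched below. The sensor size of 5184 x 3456 pixels is the published figure for an 18 MP Canon 650D, and the 76.2 x 25.4 mm slide is the standard 3 x 1 inch format; both are assumptions here, since the collection holds a mix of slide sizes. The result is a gap-free upper bound, not a prediction of what fits on a real tray.

```python
import math

MM_PER_INCH = 25.4


def field_of_view_mm(px_w, px_h, ppi):
    """Area covered by the sensor, in mm, at a given pixels-per-inch."""
    return px_w / ppi * MM_PER_INCH, px_h / ppi * MM_PER_INCH


def max_slides(px_w, px_h, ppi, slide_mm=(76.2, 25.4)):
    """Upper bound on slides per image, packed edge to edge in a grid."""
    fov_w, fov_h = field_of_view_mm(px_w, px_h, ppi)
    long_side, short_side = slide_mm
    # Try slides lying along the frame width, then turned 90 degrees.
    landscape = math.floor(fov_w / long_side) * math.floor(fov_h / short_side)
    portrait = math.floor(fov_w / short_side) * math.floor(fov_h / long_side)
    return max(landscape, portrait)


for ppi in (250, 300, 450, 600):
    print(ppi, "PPI ->", max_slides(5184, 3456, ppi), "slides at most")
```

Doubling the PPI halves each side of the field of view, so the slide count falls roughly fourfold; spacing between slides lowers it further.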

Page 17: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Resolution tests (PPI)

– Mamiya Leaf (80MP sensor) + 80mm lens

PPI      300   450   600
Slides   180    72    50

Page 18: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Resolution tests (PPI)

– HerbScanner (EPSON A3 size)

PPI      300   450   600
Slides    50    50    50

Page 19: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

• Resolution tests (PPI)

– SatScanner (0.16x lens, low resolution ~1000 PPI)

72 - 100 slides

Page 20: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Page 21: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Page 22: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Page 23: Price2 ecn2013

Preparation Handling Imaging Post Processing Data Capture

Page 24: Price2 ecn2013

Progress to date

• Psyllidae slide collection (4000 slides)

• Two digitizers + SatScanner = 4 days

• Handling (not Imaging) is the bottleneck

• Solutions:

– More digitizers

– Crowd cropping of tray scans?

Page 25: Price2 ecn2013

Progress to date

• Theoretical maximum

– SatScan: 7000 slides per day (5-8 people)

– Other: 700 - 1000 slides per person per day

• NHM Entom collection = 10 – 15 person years
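The person-year estimate can be checked with back-of-envelope arithmetic, sketched below using figures from the slides: about 2 million slides, the Psyllidae run of 4000 slides by two digitizers in 4 days, and the 700 to 1000 slides per person per day quoted above. The 220 working days per year is an assumption, not a figure from the talk.

```python
def slides_per_person_day(slides, people, days):
    """Observed throughput per digitizer per day."""
    return slides / (people * days)


def person_years(total_slides, rate_per_person_day, working_days_per_year=220):
    """Effort to digitize a collection at a steady per-person rate.

    working_days_per_year is an assumed value.
    """
    return total_slides / rate_per_person_day / working_days_per_year


observed = slides_per_person_day(4000, people=2, days=4)
for rate in (observed, 700, 1000):
    years = person_years(2_000_000, rate)
    print(f"{rate:6.0f} slides/person/day -> {years:5.1f} person-years")
```

Under that assumption, the 700 to 1000 range gives roughly 9 to 13 person-years, the same order as the 10 to 15 quoted, while the observed Psyllidae rate of 500 per person per day would take about 18.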

[Figure: staggered cycle of label, load, image and unload steps, numbered 1 to 4]

Page 26: Price2 ecn2013

Future Plans

• Specimen Imaging

– Type material

Page 27: Price2 ecn2013

Acknowledgments

Flavia

Johanna

Elisa

Peter

Lyndsey

Sara

Page 28: Price2 ecn2013

Questions?