digitisation on demand arcadia seminar

Digitisation and Print-on-demand ARCADIA Ed Chamberlain - Systems Development Librarian

Upload: edmund-chamberlain

Post on 19-May-2015

DESCRIPTION

For his Michaelmas 2010 Arcadia fellowship, Ed Chamberlain investigated ways to speed up the digitisation process in academic libraries. He identified three problem areas and explored the issues surrounding a potential solution to each, including automated book scanners and the Espresso print-on-demand machine. The seminar will recount his findings and provide an opportunity to discuss how libraries can successfully interface with innovative technologies.

About the Speaker: Ed Chamberlain works as Systems Development Librarian at Cambridge University Library. His library career so far has spanned three sectors, including Oxford, the London Library and the Natural History Museum. Here, Ed was involved in the early creation and development of online services based around digitised materials, including the Biodiversity Heritage Library mass-digitisation project. He has a BA in Politics from the University of East Anglia and an MA in Library and Information Management from Loughborough University. Ed took up his current position in 2007 and has taken a lead in the redevelopment of online services and systems supporting both electronic and print library resources. Ed's professional interests include all aspects of online library and information services, especially web design trends and underlying software architecture. He is also interested in new standards of metadata, including emerging semantic-web-based services and open publishing models for both data and content.

Please email your intent to attend to Michelle Heydon, [email protected]

This talk is part of the Arcadia Project Seminars series.

Date: Tuesday 3rd May 2011
Time: 18:00-19:15 (refreshments from 17:45)
Venue: Old Combination Room (OCR), Wolfson College

TRANSCRIPT

Page 1: Digitisation on demand arcadia seminar

Digitisation and Print-on-demand

ARCADIA

Ed Chamberlain - Systems Development Librarian

Page 2: Digitisation on demand arcadia seminar

Question:

How could we (better) automate the digitisation workflow?

Page 3: Digitisation on demand arcadia seminar
Page 4: Digitisation on demand arcadia seminar

Why is digitisation important to libraries?

1. Better expose existing collections to a wider audience

2. Better meet reader expectation of ‘everything online’

3. Preserve material

Page 5: Digitisation on demand arcadia seminar

Why is it important to me?

1. Previous work on the Biodiversity Heritage Library project

2. I feel that libraries are still not fulfilling their tremendous potential here

3. Cambridge has no ‘Google Books project’. Is there an alternative model?

4. Is a one-site UL sustainable forever?

Page 6: Digitisation on demand arcadia seminar

What's happening now in Cambridge?

Digitisation is focused on special collections

Limited funds

Cambridge's USP!

Relatively slow, manual process done to exemplar standards

Not scalable (at an effective cost)

Page 7: Digitisation on demand arcadia seminar

As it stands …

Page 8: Digitisation on demand arcadia seminar

Barriers to digitisation

Barriers are not technology-centric

1. Copyright legislation

2. Cost / time

3. Difficulty in reading on a screen

Page 9: Digitisation on demand arcadia seminar

Areas of investigation …

Examine technological responses to barriers:

1. Copyright legislation => Speed up / rationalise copyright analysis

2. Cost / time => Explore automated book scanning

3. People prefer a book to a screen => Explore print on demand

Page 10: Digitisation on demand arcadia seminar

What exactly?

1. Copyright legislation => Speed up the copyright analysis => COPYRIGHT CALCULATOR

2. Cost / time => Explore automated book scanning => KIRTAS AUTOMATIC BOOK SCANNER

3. People prefer a book to a screen => Explore print on demand => ESPRESSO BOOK PRINTING MACHINE

Page 11: Digitisation on demand arcadia seminar

Focus of fellowship …

1. COPYRIGHT CALCULATOR

2. KIRTAS AUTOMATIC BOOK SCANNER

3. ESPRESSO BOOK PRINTING MACHINE

… investigate them as a basis for a potential ‘on-demand’ digitisation service

Page 12: Digitisation on demand arcadia seminar

Imagine …

Full or partial digitisation of a work, initiated from the catalogue in place of a stack request

Straight to desktop in less than a day

Order a bound print copy as an option

If it’s a public domain work, it is then made available to all under a Creative Commons licence…

Page 13: Digitisation on demand arcadia seminar

Full digitisation at readers’ request …

Page 14: Digitisation on demand arcadia seminar

Why ‘On-Demand’?

Expectation of modern society

Self-sustaining: if the reader pays the cost of digitisation, no large external donor is needed

‘Every book its reader’

‘Save the time of the reader’

Page 15: Digitisation on demand arcadia seminar

Fellowship methodology …

Explore each area in turn …

Visit case studies

Assemble facts and figures where possible

Draw out advantages and disadvantages of each piece of technology

Page 16: Digitisation on demand arcadia seminar

1) Copyright and copyright calculation

Page 17: Digitisation on demand arcadia seminar

Basic problems with copyright:

Fiendish stuff

Complexity slows down decisions

Upsets risk-averse Librarians

We can only fully digitise what is in the Public Domain

Page 18: Digitisation on demand arcadia seminar

Scope for automation

Copyright legislation as a set of rules into which data about a work is fed

Out comes a result (yes/ no/ probably)

Sounds like a job for a machine, rather than a person …
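The "rules in, result out" idea can be illustrated with a minimal sketch. This is not the Open Knowledge Foundation calculator or its API. It assumes a single simplified rule (copyright lasts for the author's life plus 70 years, running to the end of the calendar year) and a hypothetical "safe cut-off" for works with no recorded death date. The function name, parameters and the 140-year cut-off are all illustrative assumptions.

```python
from typing import Optional


def public_domain_status(
    pub_year: int,
    author_death_year: Optional[int],
    as_of_year: int = 2011,
    term_years: int = 70,          # assumption: simplified life + 70 rule
    safe_cutoff_years: int = 140,  # assumption: illustrative cut-off, not a legal test
) -> str:
    """Return 'yes', 'no' or 'probably' for a single work.

    A deliberately simplified, hypothetical rule set: real legislation
    has many more cases (anonymous works, multiple authors, jurisdiction).
    """
    if author_death_year is not None:
        # Protection runs to the end of the year, term_years after death,
        # so the work is free from 1 January of the following year.
        if as_of_year > author_death_year + term_years:
            return "yes"
        return "no"

    # No death date in the catalogue record: fall back on publication date.
    if as_of_year - pub_year > safe_cutoff_years:
        return "probably"
    # Not old enough to assume anything: treat as not cleared for copying.
    return "no"


if __name__ == "__main__":
    print(public_domain_status(1850, 1870))  # author died 1870
    print(public_domain_status(1950, 1980))  # author died 1980
    print(public_domain_status(1860, None))  # death date unknown, old work
```

The "probably" branch is where incomplete bibliographic data shows up: without a death date, the sketch can only reason from the publication year, so a human would still need to check those cases.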

Page 19: Digitisation on demand arcadia seminar

… Exactly what others have thought

Open Knowledge Foundation - Public domain works project / Europeana

Now exists as a machine accessible API

Feed in bibliographic data, get a result

Page 20: Digitisation on demand arcadia seminar
Page 21: Digitisation on demand arcadia seminar
Page 22: Digitisation on demand arcadia seminar

Conclusions on copyright calculation

Out of the 100 samples, 76 returned an expected result given the data available (a further 8 could have been useful if a safe cut-off point had been added)

Great technology to potentially assist in decision making

As useful for asserting what is not in the public domain as for asserting what is

Data we can provide is incomplete for the task – sometimes further research will be needed

Great feature for a library catalogue - kick off an ordering process

Page 23: Digitisation on demand arcadia seminar

2) Digitisation-on-demand

Page 24: Digitisation on demand arcadia seminar

Not that difficult to copy a book quickly…

Page 25: Digitisation on demand arcadia seminar

Why Kirtas?

Two in Cambridge, at the Press

Used in the Cambridge libraries Collections project

CUP let me take a look!

Page 26: Digitisation on demand arcadia seminar

Kirtas video …

http://www.youtube.com/watch?v=V03s5oJDwwc

Page 27: Digitisation on demand arcadia seminar

Automated page turning …

But with a human watching just in case …

Cost saving?

Still quicker than ‘by hand’

Page 28: Digitisation on demand arcadia seminar

Automated post processing …

But images are also sent to India for a two-week tidy-up

Quick enough for on-demand?

Page 29: Digitisation on demand arcadia seminar

What level of quality is sufficient for a library surrogate?

Focus on improving access rather than preservation

Would a preservation quality image be too expensive to produce for an on-demand approach?

For the iPad and Kindle - text is as important as a scanned image

Page 30: Digitisation on demand arcadia seminar

Demand for this kind of thing?

91% (56/61) of Cambridge academics surveyed would be interested in a full-text digital copy of an out-of-copyright work

62% (36/58) would be interested in a partial digital copy of an in-copyright work, if available

Page 31: Digitisation on demand arcadia seminar

What can we copy?

Pub. date       Items   % PD    No. PD
1400-1850     304,587    100   304,587
1850-1860      40,970    100    40,970
1860-1870      43,734    100    43,734
1870-1880      50,564     95    48,035
1880-1890      66,857     90    60,171
1890-1900      66,883     85    56,850
1900-1910      70,360     65    45,734
1910-1920      60,489     40    24,195
1920-1930      78,670     25    19,667
1930-1940      90,576     10     9,057
1940-1950      72,692      6     4,361
1950-1960     118,251      0         0
1960-1970     262,974      0         0
1970-2009   2,130,509      0         0
Total       3,458,116     19   657,361

Estimates of University of Cambridge holdings within the public domain (PD). R. Pollock, 2009
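As a check on the table above, the short sketch below recomputes the "No. PD" column and the totals from the item counts and percentages. The figures are copied from the slide. The rounding rule (integer division, i.e. rounding down) is an inference from the numbers, not something the slide states, and the function names are illustrative.

```python
# (publication period, items held, estimated % in the public domain),
# copied from the table on the slide.
HOLDINGS = [
    ("1400-1850", 304_587, 100),
    ("1850-1860", 40_970, 100),
    ("1860-1870", 43_734, 100),
    ("1870-1880", 50_564, 95),
    ("1880-1890", 66_857, 90),
    ("1890-1900", 66_883, 85),
    ("1900-1910", 70_360, 65),
    ("1910-1920", 60_489, 40),
    ("1920-1930", 78_670, 25),
    ("1930-1940", 90_576, 10),
    ("1940-1950", 72_692, 6),
    ("1950-1960", 118_251, 0),
    ("1960-1970", 262_974, 0),
    ("1970-2009", 2_130_509, 0),
]


def public_domain_count(items: int, percent_pd: int) -> int:
    """Estimated public domain items for one period, rounded down."""
    return items * percent_pd // 100


def summarise(rows):
    """Return (total items, total public domain items, overall % PD)."""
    total_items = sum(items for _, items, _ in rows)
    total_pd = sum(public_domain_count(items, pct) for _, items, pct in rows)
    return total_items, total_pd, 100 * total_pd / total_items


if __name__ == "__main__":
    for period, items, pct in HOLDINGS:
        print(f"{period:<10}{items:>11,}{pct:>7}{public_domain_count(items, pct):>10,}")
    total_items, total_pd, overall = summarise(HOLDINGS)
    print(f"{'Total':<10}{total_items:>11,}{overall:>7.0f}{total_pd:>10,}")
```

Run this way, the per-period counts and both totals agree with the slide, and the overall share comes out at roughly 19%.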

Page 32: Digitisation on demand arcadia seminar

What can we copy?

Around 19% of CUL’s collections fall within the public domain

Niche interest in this area: 2% of circulation transactions involved material from 1850-1920

Page 33: Digitisation on demand arcadia seminar

How much does it cost?

Cheaper than current services …

Imaging option: Photocopy/Scan
Image type: A4 300 dpi (PDF)
Image production (350 images at £0.50): £175.00
Service charge (15%): £26.25
VAT (20%): £35.00
Total: £236.25 for 350 pages

But still not that cheap…

About £30 for a 350-page work (cost modelling based around the Kirtas manned by imaging services staff)

No capital recoup in that model
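The arithmetic behind the current-service quote can be reproduced in a few lines. The prices and rates are taken from the slide. Applying VAT to the image production cost alone is an assumption, made because the quoted £35.00 equals 20% of £175.00. The names below are illustrative only.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.50")   # from the slide
SERVICE_RATE = Decimal("0.15")      # 15% service charge, from the slide
VAT_RATE = Decimal("0.20")          # 20% VAT, from the slide
KIRTAS_ESTIMATE = Decimal("30")     # approx. cost of a 350-page work, from the slide


def current_service_quote(pages: int) -> dict:
    """Break down the existing imaging service price for a given page count."""
    image_production = PRICE_PER_IMAGE * pages
    service_charge = image_production * SERVICE_RATE
    # Assumption: VAT is charged on image production only, which is what
    # the slide's figure of 35.00 implies.
    vat = image_production * VAT_RATE
    return {
        "image_production": image_production,
        "service_charge": service_charge,
        "vat": vat,
        "total": image_production + service_charge + vat,
    }


if __name__ == "__main__":
    quote = current_service_quote(350)
    for label, amount in quote.items():
        print(f"{label:<17} £{amount:.2f}")
    # How many times more expensive the current service is than the estimate.
    print(f"ratio to Kirtas estimate: {quote['total'] / KIRTAS_ESTIMATE:.1f}x")
```

On these figures the existing service costs several times the Kirtas-based estimate for the same 350 pages, which is the comparison the slide is drawing.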

Page 34: Digitisation on demand arcadia seminar

How much would readers pay?

Survey information reveals that 66% (36/54) of academic users would prefer to pay under £15 for a digitised copy

Achieving this at cost or with a small surplus would be a challenge

Attempting to recoup capital investment directly would push costs beyond a ‘sweet-spot’ price point

Should they have to pay at all?

Page 35: Digitisation on demand arcadia seminar

Conclusions for digitisation-on-demand

Great technology, nice idea, some demand

Somewhat limited as an effective service by size of public domain

Large upfront costs if Kirtas purchased

Other cost models available (lease hire, outsource)

Page 36: Digitisation on demand arcadia seminar

3) Print-on-demand

Page 37: Digitisation on demand arcadia seminar

Print on demand

Nothing new for publishing

Espresso Book Machine is the most exciting thing out there

Page 38: Digitisation on demand arcadia seminar

EBM video

http://www.youtube.com/watch?v=Q946sfGLxm4

Page 39: Digitisation on demand arcadia seminar

Blackwells Experience

Lots of interest

Needs full-time staff to run

Strong interest in self-publishing (theses)

Increasing amounts of material available from a variety of sources (Project Gutenberg, Google Books, publishers)

Page 40: Digitisation on demand arcadia seminar

Utah Experience

“It undermines the need for traditional subject selection, disrupting a major sub-discipline of librarianship. By doing so, it also undermines the rationale for a large research collection—if the purpose of the collection is to meet patrons’ information needs, and if they can now be met without buying and housing a large just-in-case collection, then how do we defend the unbelievably expensive and arguably quite wasteful practice of traditional collection building?”

Rick Anderson, Marriott Library, University of Utah

Page 41: Digitisation on demand arcadia seminar

Utah Experience

“Undermines the need for publishers to print speculative runs of new books, thus potentially changing in a drastic way the logistics of the publishing world. In a rational marketplace, every bookstore would have an EBM or something that works on the same principle, and books would only be printed at the point of demand and purchase”

“Obviously, its full potential has yet to be realized—but the fundamental model is now in place. What are left to fix (bad metadata, incomplete catalog, rights issues, etc.) are the details. In most cases, fixing them will require only money and effort, and as roadblocks go those are relatively simple ones”

Rick Anderson, Marriott Library, University of Utah

Page 42: Digitisation on demand arcadia seminar

Demand?

65% would also be interested in a print facsimile

42% of academic respondents would be willing to pay £10-£15, 33% £15-£25

Page 43: Digitisation on demand arcadia seminar

Costs?

£10 per 350-page volume

Blackwells have a pricing model that does not recoup capital

Page 44: Digitisation on demand arcadia seminar

Final thoughts …

Page 45: Digitisation on demand arcadia seminar

Conclusions for both print and digitisation

High upfront cost: any model that attempts to recoup capital through charges prices itself out of the market

High upfront cost: high risk of failure

‘Innovator's dilemma’: we are in effect in competition with our bread-and-butter services

Page 46: Digitisation on demand arcadia seminar

Conclusions for both print and digitisation

Aiming to hit a moving target of user expectation

Danger of early adoption – not understanding or being aware of longer term issues (ejournals)

Page 47: Digitisation on demand arcadia seminar

Conclusions for both print and digitisation

Demand is high

Breakthrough technology – getting cheaper

Page 48: Digitisation on demand arcadia seminar

Are libraries losing digital customers by playing fair?

Google continues to digitise, despite legal setbacks, and to gain the headlines

Users continue to digitise for themselves…

Privately, in research groups

‘Socially’ (http://library.nu/ and other academic torrent sites)

Many in academia now choose to ignore or challenge the inflexibilities of copyright to get the material they need

Page 49: Digitisation on demand arcadia seminar

How could we respond?

Remove barriers: make it easier for people to get the material they need for free (or cheaply)

Lower costs: new approaches, new models of working

Page 50: Digitisation on demand arcadia seminar

Ed Chamberlain

[email protected]

@edchamberlain

This work is licensed under a Creative Commons Attribution 3.0 Unported License.