Mass Digitisation and Long-term Preservation – Processes and Production at Munich Digitisation Centre
Dr. Markus Brantl
Copyright Bayerische Staatsbibliothek
Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008
Who invented Mass Digitisation?
Agenda
1. The Bavarian State Library (BSB) and the Munich Digitisation Centre (MDZ)
2. BSB Digitisation Strategy and the Challenges for Mass Digitisation and Long-Term Preservation
3. Processes and Production
4. Projects, Examples
5. Outlook
The Free State of Bavaria and Munich
Population: 12.5 million
7 counties
8 towns with more than 100,000 inhabitants
Capital city: Munich (1.5 million)
245,000 university students
Bavarian State Library: Some Key Facts & Figures
Founded in 1558 by Duke Albrecht V
680 employees
Annual budget of 43 million euros
Inventory of ~10 million volumes, with an annual increase of ~150,000 volumes
~1 million visitors in the reading room
…
Major Functions of BSB
International research library of worldwide importance
Central regional and archival library of the Free State of Bavaria
Head of the Bavarian Library Network (BVB) and leader of the Bavarian Library Consortium
One of three partners of the "German Virtual National Library" (together with Staatsbibliothek zu Berlin and Die Deutsche Nationalbibliothek in Frankfurt/Leipzig)
BSB as a Gateway to Cultural Heritage Resources
Collections: timeframe from the 8th to the 21st century
Manuscripts and Rare Books Collection of world importance
89,000 manuscripts (number 4 worldwide)
19,000 incunabula (top position in the world)
130,000 16th-century prints (largest collection in Germany)
Illuminated Reichenau Manuscripts part of the UNESCO Memory of the World since 2004
The Munich Digitisation Centre (MDZ)
1997: founded with funding from the German Research Foundation (DFG) as one of two national digitisation centres
2003: integrated as a division of the Department of Acquisition, Collection Development and Cataloguing
MDZ today
National competence centre for digitisation and long-term preservation
Central innovation and digital production unit of BSB
MDZ Tasks within the Framework of „The Digital Library“
Research & Development
Production
Consulting (e.g. workflows)
Service provider (e.g. scanning technology, open-source software for digitisation)
in
(Retro-)Digitisation
Long-term Preservation
Subject-Specific Portals
MDZ – Some Facts
Digitisation expertise in materials from the 8th to the 21st century
~32 M. files successfully processed
= ~10 M. pages – 27,000 books
= ~60 terabytes of digital data stored and long-term preserved in cooperation with the Leibniz Supercomputing Centre (Munich-Garching)
= the largest operating long-term preservation archive in Germany
Complete production line from capture to long-term preservation, based on the in-house software ZEND
Comprehensive hardware equipment, e.g. for processing the Google Book Search image copies
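As a rough plausibility check, the averages implied by the figures above can be worked out directly. The sketch below assumes decimal units (1 terabyte = 10^12 bytes) and treats the rounded slide figures as exact; all names are illustrative only.

```python
# Rounded figures taken from the slide; decimal units are an assumption.
FILES = 32_000_000
PAGES = 10_000_000
BOOKS = 27_000
STORED_BYTES = 60 * 10**12  # ~60 terabytes


def averages(files, pages, books, stored_bytes):
    """Return per-unit averages derived from the headline totals."""
    return {
        "files_per_page": files / pages,
        "pages_per_book": pages / books,
        "megabytes_per_file": stored_bytes / files / 10**6,
    }


stats = averages(FILES, PAGES, BOOKS, STORED_BYTES)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions each page corresponds to about 3.2 files, and the average file is just under 2 megabytes.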
Examples of running R&D Projects at MDZ
Long-term Preservation
Development of organisational and business models for the long-term preservation of digital objects in Germany (German Research Foundation)
BABS II – next step in deployment of practical long-term preservation (German Research Foundation)
…
Digitisation
IMProving Access to Text – Innovation in OCR
Development of new book scanning devices for fragile books
Automated book scanners for 16th-century books – development partnership with the Treventus company
Book cradles
3D animation of books via WWW
…
Organisational structure 1
MDZ as Part of the Department of Acquisition, Collection Development and Cataloguing
MDZ Today's Organisation Unit in Detail
MDZ Staff & Financing
5 full-time and 1 part-time established posts
70 full- and part-time temporary employees (44.5 FTE) financed by third-party grants from:
European Union
German Research Foundation (DFG)
The Free State of Bavaria / Ministry of Science
www.munich-digitisation-centre.de
Daily Status
Digitisation Strategy
Objective of retrodigital collection development at BSB:
to digitise and make accessible – free of charge – all copyright-free library holdings
The Four Columns of
Digitisation Strategy of the BSB
Third-party funds, Dig. On Demand, Conserv. Dig.
8th-16th century
Manuscripts, incunabula, special collections
Public-Private Partnership with (Google)
17th-19th century
Third-party funds
20th-21st century (new programmes in the context of the German Digital Library approach)
What is Mass Digitisation?
Production of more than a million pages? Volumes?
"Definition" today (in Germany): production of more than a million pages within a limited time
And tomorrow …?
What have we done before Mass Digitisation?
"Boutique" digitisation
Since 1997
Small to medium-sized projects
Explorative projects – learning and experimenting
Selective picking of special objects under marketing aspects
Deeper indexing of objects
Manual work
Mass digitisation
Started in 2006 with a pilot
Large projects with 1 million pages
Using technology and automation whenever possible
No selection
Dynamic indexing – starting at a low level; deeper indexing when technically possible (e.g. Blackletter script)
Mass Digitisation: Plans for Quantities?
German Research Foundation (DFG) target for the digitisation of 16th-18th century books
~1 M. titles = ~250 M. pages
Duration: Next 5-? years
Current Mass Digitisation Projects at MDZ

| Project | Volumes | Date | Pages |
|---|---|---|---|
| VD16digital-1 | ~4,300 | 2006-2008 | ~1 M. |
| VD16digital-2 | ~37,000 | 2007-2009 | ~7.5 M. |
| Inkunabeln (incunabula) | ~9,000 | 2008-2012 | ~1.6 M. |
| Public-Private Partnership | > 1.2 M. | 2008- | > 300 M. |

3 ScanRobots in the MDZ Scan Centre
Linux cluster of the MDZ for processing the Google book copy
Mass Digitisation: Main Challenges
Logistics: book pulling, transport, book-loan charging, quality management, return transport, reshelving …
Integration of metadata
Storage and preservation – approx. 300,000 images per day
Automated processing, e.g. automated book scanning devices
…
Mass Digitisation and Long-term Preservation
MDZ data production
1997-2004: over 3,000 CD-ROMs
2004: migrated from CD-ROMs to tapes at the Leibniz Supercomputing Centre
2005: start of the newly built Scan Centre, fast-growing archive
New challenges: ScanRobots and Google's output – more and more terabytes
Agenda
1. The Bavarian State Library (BSB) and the Munich Digitisation Centre (MDZ)
2. BSB Digitisation Strategy and the Challenges for Mass Digitisation and Long-Term Preservation
3. Processes and Production
   a) Processes and Production Cycle
      Preparation
      Handling of Originals
      Scanning
      Storage and Long-term Preservation
      Metadata Creation
      Online Publication and Data Management
      Proof
      Reuse
   b) ZEND Production Software
   c) Example of a Production Workflow with ZEND
4. Projects
5. Outlook
Processes and Production Cycle
Preparation
Selection of all relevant titles for the digitisation project from the collections
Blocking the selected items for other forms of use (e.g. lending orders)
Printing of order forms for collecting the relevant items from the shelves
Checking the scanability of the selected items under conservation aspects
Registering the reason for non-scanability for each affected item
Correction of the catalogue if necessary
Printing scanning order forms with Digital Identifier
Checking that the books for scanning match the metadata of the selected title
Delivery of the books of one batch, together with scanning order forms, to the Scan Centre
… Scanning …
Taking the books back from the Scan Centre and checking their completeness and actual condition
Reshelving the books
Unblocking lending orders
…
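The preparation checklist above can be read as a small state machine per item. The following is a minimal sketch of that reading, not a description of how ZEND or the BSB systems actually track items: the step names, the `Item` class and the identifier format are all hypothetical, and several checklist entries (catalogue correction, metadata matching) are folded into neighbouring steps for brevity.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """Preparation steps in checklist order (names are hypothetical)."""
    SELECTED = 1
    BLOCKED = 2
    COLLECTED = 3
    SCANABILITY_CHECKED = 4
    DELIVERED_TO_SCAN_CENTRE = 5
    RETURNED = 6
    RESHELVED = 7
    UNBLOCKED = 8


@dataclass
class Item:
    digital_id: str
    step: Step = Step.SELECTED
    non_scanable_reason: Optional[str] = None

    def advance(self) -> None:
        """Move to the next step; rejected or finished items cannot advance."""
        if self.non_scanable_reason is not None:
            raise ValueError(f"{self.digital_id} rejected: {self.non_scanable_reason}")
        if self.step is Step.UNBLOCKED:
            raise ValueError(f"{self.digital_id} has already completed the cycle")
        self.step = Step(self.step + 1)

    def reject(self, reason: str) -> None:
        """Register why the item cannot be scanned, as the checklist requires."""
        self.non_scanable_reason = reason


book = Item("example-0001")  # made-up identifier format
while book.step is not Step.UNBLOCKED:
    book.advance()
print(book.digital_id, book.step.name)

fragile = Item("example-0002")
fragile.reject("binding too tight")  # illustrative reason only
```

Keeping the rejection reason on the item itself mirrors the checklist's requirement to register why a given item could not be scanned.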
Handling of Different Material Types
BSB's case: wide range of materials from the 8th to the 21st century
Manuscripts
Incunabula
Early and rare prints
Historical maps
Photographic material (slides, glass plates etc.)
Monographs
Newspapers
Magazines
…
Handling: Central Risks
Damage caused by
Manual handling of the originals
Room climate, e.g. during transport and reproduction
Light
Handling: Damage Caused by an Opening Angle of 180°
120° opening angle: spine stretched, but fold still okay
The same book at a 180° opening angle: fold broken
Books Handling during Reproduction
BSB's goal: reduction of central risk factors of damage
No fixation under a glass plate during reproduction
Deformation of the book body
Close cooperation with the Institute for Book and Manuscript Restoration at BSB
Cooperation in the selection process for scanning devices
Obligatory use of
Cool lighting
Different book cradles for gentle book handling
Disadvantage: Higher set-up time
The MDZ Scan Centre
Especially for books and special collection materials before 1800 (modern books are outsourced)
2005: developed from a reorganisation of the analogue reproduction services
Until 2004
Since 2005
Scan Facilities
14 book scanners
3 automated book scanners, "ScanRobots"
2 special book scanners for rare manuscripts and treasure holdings – "Grazer camera table"
Formats up to ~0.85 × 1.18 metres (DIN A0)
High resolution up to 600 ppi
Image Scanning: Our Philosophy
"Do it right the first time"
Scanning at the best possible quality
Multiple reuse
Facsimile, reprint
WWW
Digitisation on Demand
…
Images
Digital master files: TIFF
Derivatives for WWW presentation: JPEG and PDF
Extensive automation of workflows
Image Quality and Scanning Parameters
High-resolution digitisation in relation to the original size of manuscripts, early printed books, historical maps, special materials
Colour (24 bit)
400 up to 600 ppi
Digital master: TIFF uncompressed
Use of
Colour management software (ICC profiles)
Size and colour targets
Modern prints of the 19th and 20th centuries, depending on the original
Text only: black and white (1 bit)
Text/images: 1 bit and greyscale (8 bit) and/or colour
300 up to 600 ppi (600 ppi only 1 bit and TIFF G4 compressed)
Image storage size between 100 kilobytes and 800 megabytes per image
Detail pictures (same image enlarged) in 24 bit, 1 bit, 8 bit
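To see how the parameters above translate into storage size, the raw pixel data of an uncompressed scan can be estimated from the original's dimensions, the resolution and the bit depth. This is a sketch under stated assumptions: the function name and the mode labels are made up for illustration, TIFF header and tag overhead is ignored, and the paper sizes in the usage lines (A4, A0) are examples rather than figures from the slides.

```python
import math

BITS_PER_PIXEL = {"bitonal": 1, "greyscale": 8, "colour": 24}
MM_PER_INCH = 25.4


def uncompressed_bytes(width_mm, height_mm, ppi, mode):
    """Estimate raw pixel data size; TIFF header and tag overhead are ignored."""
    width_px = round(width_mm / MM_PER_INCH * ppi)
    height_px = round(height_mm / MM_PER_INCH * ppi)
    # Each row of uncompressed data is padded to a whole number of bytes,
    # which only matters for 1-bit images.
    bytes_per_row = math.ceil(width_px * BITS_PER_PIXEL[mode] / 8)
    return bytes_per_row * height_px


a4_text = uncompressed_bytes(210, 297, 300, "bitonal")
a0_map = uncompressed_bytes(841, 1189, 400, "colour")
print(f"A4, 300 ppi, 1 bit:  {a4_text / 10**6:.2f} MB")
print(f"A0, 400 ppi, 24 bit: {a0_map / 10**6:.0f} MB")
```

A 24-bit A0 scan at 400 ppi comes out in the same range as the upper bound quoted above, while a 1-bit A4 page is about one megabyte before compression; sizes near the quoted lower bound would rely on compression such as TIFF G4.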
Scanning Output: Examples
Digital Long-Term Preservation
Still in an experimental phase, no all-round solution
Different approaches
Migration
Emulation
Computer museum
In cooperation on national and international level
nestor – the German network of expertise in digital long-term preservation
Institute of Informatics – University of the German Federal Armed Forces
European Union
USA
MDZ focuses on
Digital long-term preservation in practice with the Leibniz Supercomputing Centre
Trusted Repositories: certification for long-term archives
MDZ: Milestones in Long-term Preservation
1999: First project concerning removable media – testbed of a computer museum (Institute of Informatics, GFAF)
2004: First migration of digitisation data stored on CD-ROM to a large-scale archiving system of the Leibniz Supercomputing Centre (LRZ) in Munich (IBM Tivoli Storage Manager)
2005: Automated storage procedures are implemented, connecting the storage/archiving system of the LRZ and the MDZ production infrastructure (ZEND software)
Daily online transfer of new data
Secure storage of multiple copies
Long-term preservation
Rapid restore functions
Long-term Preservation: The Archival System of the LRZ
2 x IBM TS3500 (3584)
20 LTO II drives
35 MByte/s transfer rate
5,000 tape slots
200 GByte per tape
990 TByte total capacity
Maximum possible capacity: 6,900 tape slots, 1,400 TByte
STK SL8500
16 Titanium drives
120 MByte/s transfer rate
4,900 tape slots
500 GByte per tape
2,400 TByte total capacity (eq. 3.8 M CDs)
Maximum possible capacity: 300,000 tape slots, 146,000 TByte
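The capacity figures for the two tape libraries follow approximately from slots multiplied by capacity per tape. The sketch below reproduces that arithmetic; the function name is illustrative, and since the slides do not say whether a TByte means 1000 or 1024 GByte, both conventions are shown.

```python
def library_capacity_tbyte(slots, gbyte_per_tape, gbyte_per_tbyte=1000):
    """Raw capacity of a tape library in TByte (no compression assumed)."""
    return slots * gbyte_per_tape / gbyte_per_tbyte


# (slots, GByte per tape) as listed for each configuration.
configurations = {
    "IBM TS3500, installed": (5_000, 200),
    "IBM TS3500, maximum": (6_900, 200),
    "STK SL8500, installed": (4_900, 500),
    "STK SL8500, maximum": (300_000, 500),
}

for name, (slots, per_tape) in configurations.items():
    decimal = library_capacity_tbyte(slots, per_tape)
    binary = library_capacity_tbyte(slots, per_tape, gbyte_per_tbyte=1024)
    print(f"{name}: {decimal:,.0f} TByte (decimal), {binary:,.0f} TByte (binary)")
```

The products land within a few percent of the quoted totals; the remaining differences presumably come from rounding or from the unit convention used on the slide.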
MDZ: Increase of Long-Term Preserved Data
[Chart: long-term preserved data by month, Jan 05 to Jan 08; vertical axis from 0 to 60,000,000]
Expected annual increase: at least 100-150 terabytes
Copyright Bayerische Staatsbibliothek
![Page 46: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/46.jpg)
46
Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008
Copyright Bayerische Staatsbibliothek
![Page 47: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/47.jpg)
Agenda
1. The Bavarian State Library (BSB) and the Munich Digitisation Centre (MDZ)
2. BSB Digitisation Strategy and the Challenges for Mass Digitisation and Long-Term Preservation
3. Processes and Production
   a) Production Cycle and Subprocesses
      - Preparation
      - Handling of Originals
      - Scanning
      - Storage and Long-term Preservation
      - Metadata Creation
      - Online Publication and Data Management
      - Proof
      - Reuse
   b) ZEND Production Software
   c) Example of a Production Workflow with ZEND
4. Projects
5. Outlook
![Page 48: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/48.jpg)
Indexing – Metadata Creation

Metadata is needed for
- Search and Retrieval
- Navigation/Browsing

Creation of
1. Bibliographical (descriptive) metadata = from the OPAC
2. Structural metadata = stage model for indexing a book
   a) Images only
   b) Images and text of the table of contents
   c) Images, text of the table of contents and indices
   d) Images and "hidden text" (with errors)
   e) Complete, layout-faithful, nearly error-free text
3. Technical metadata = How was the data created?
4. Administrative metadata = When was the data changed? By whom?
![Page 49: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/49.jpg)
Standards for Document Formats and Character Set

XML as data exchange format with
- Metadata Encoding and Transmission Standard (METS)
- Text Encoding Initiative (TEI)

Character set
- Unicode, UTF-8 encoding
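As an illustration of METS as an XML exchange format in UTF-8, the following minimal sketch builds a skeleton METS document for a scanned book using only the Python standard library. It is not the MDZ's actual profile: the function name, the `USE="MASTER"` file group, the `FILE_…` identifiers and the `book`/`page` division types are assumptions chosen for the example. Only the METS and XLink namespaces and the element names come from the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(namespace: str, tag: str) -> str:
    """Build a namespace-qualified name in ElementTree's {ns}tag notation."""
    return f"{{{namespace}}}{tag}"


def build_mets(object_id: str, image_files: list[str]) -> bytes:
    """Return a skeleton METS document (UTF-8) listing one file per page."""
    root = ET.Element(q(METS_NS, "mets"), {"OBJID": object_id})
    file_sec = ET.SubElement(root, q(METS_NS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), {"USE": "MASTER"})
    struct_map = ET.SubElement(root, q(METS_NS, "structMap"), {"TYPE": "PHYSICAL"})
    book = ET.SubElement(struct_map, q(METS_NS, "div"), {"TYPE": "book"})

    for order, name in enumerate(image_files, start=1):
        file_id = f"FILE_{order:05d}"  # hypothetical ID scheme
        file_el = ET.SubElement(
            file_grp, q(METS_NS, "file"), {"ID": file_id, "MIMETYPE": "image/tiff"}
        )
        ET.SubElement(
            file_el,
            q(METS_NS, "FLocat"),
            {"LOCTYPE": "URL", q(XLINK_NS, "href"): name},
        )
        # One physical division per page, pointing at its image file
        page = ET.SubElement(
            book, q(METS_NS, "div"), {"TYPE": "page", "ORDER": str(order)}
        )
        ET.SubElement(page, q(METS_NS, "fptr"), {"FILEID": file_id})

    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    files = ["bsb00001119_00001.tif", "bsb00001119_00002.tif"]
    print(build_mets("bsb00001119", files).decode("utf-8"))
```

Descriptive metadata (for example a record taken from the OPAC) would go into a `dmdSec`, which this sketch leaves out.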
![Page 50: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/50.jpg)
![Page 51: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/51.jpg)
Building the WWW Interfaces

Depending on the project requirements and the depth of indexing, for
- Search and Retrieval
- Navigation/Browsing
- Usability
- Accessibility
![Page 52: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/52.jpg)
Examples of Different Online Publications
![Page 53: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/53.jpg)
![Page 54: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/54.jpg)
Proofs of Digitised Items in Catalogues, Subject Gateways ...
![Page 55: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/55.jpg)
![Page 56: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/56.jpg)
Reuse

- Digital master files at high resolution
- Cross-media publishing
  - Facsimile and reprint
  - Exhibition catalogues
  - …
- Document delivery
- Deeper indexing at any time
- 3D
![Page 57: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/57.jpg)
Reuse: Example of a 3D Animation with a Book of Arms

Testbed in collaboration with
![Page 58: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/58.jpg)
Last but not least: Quality Control

- Everywhere in production, at defined points
- But: the 100% accuracy of "boutique" digitisation times is unrealistic for mass digitisation (only spot checks)
- New approaches, e.g.
  - Checking the page number sequence with OCR during scanning, for completeness
  - Involving the users in QC by offering a comment form with each image
- Problem: each reprocessing increases costs
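The idea of checking the page number sequence for completeness can be sketched as follows. This is an illustrative check, not the MDZ implementation: it assumes the OCR step has already produced one printed page number per image (or `None` where no number was recognised), and the function name is hypothetical.

```python
from typing import Optional


def find_sequence_breaks(
    ocr_page_numbers: list[Optional[int]],
) -> list[tuple[int, int, int]]:
    """Return (image_index, expected, found) for each break in the sequence.

    A break means the printed page number on an image is not the successor
    of the previous one, which hints at a skipped or double-scanned page.
    """
    breaks = []
    previous = None
    for index, number in enumerate(ocr_page_numbers):
        if number is None:
            # No number recognised on this image: assume the page exists
            # and advance the expected value without reporting a break.
            if previous is not None:
                previous += 1
            continue
        if previous is not None and number != previous + 1:
            breaks.append((index, previous + 1, number))
        previous = number
    return breaks


if __name__ == "__main__":
    # Page 4 is missing between the third and fourth image
    print(find_sequence_breaks([1, 2, 3, 5, 6]))
```

For unpaginated material, such as many 16th-century books, a check like this has nothing to work with, so other completeness checks would be needed.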
![Page 59: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/59.jpg)
![Page 60: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/60.jpg)
ZEND Production Software

ZEND = Zentrale Erfassungs- und Nachweisdatenbank [central acquisition and proof database]

- ZEND developed by the MDZ since 2003
- Electronic publishing system with modules for
  - Document management
  - Web content management
  - Workflow management
  - Long-term preservation of digital media
- Based on open-source software
  - Linux
  - Apache
  - MySQL
  - PHP (or Perl)
  - Cocoon, the XML publishing framework from Apache
  - …
- Flexible (for different media types)
- Scalable and expandable (e.g. cross-server architecture)
![Page 61: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/61.jpg)
ZEND Goals

- Mapping of the entire production cycle and its processes in a modular system
- Different service providers (scanning, text capture) can supply unlimited data to ZEND
- Workflow control: every item of the BSB that is to be digitised follows only the ZEND workflow
- Time and cost reduction through extensive automation
![Page 62: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/62.jpg)
ZEND Main Features

- Web-based user interface and administration
- Specific job control
- Collaborative working
- Import of scanned books processed by different, unlimited service providers
- Automatic URN allocation with link resolving via the German National Library
- OAI data provider
- Import of bibliographic metadata via the Z39.50 interface
- XML support for all common standards (for example, TEI, METS)
- Rights management: in-house digital reading room or WWW publication
- Long-term archiving
- Search and retrieval
- OCR server (works only for Latin print types)
- PDF on demand
- …
![Page 63: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/63.jpg)
ZEND – Automated Processes

Fully automatic processes
- Collection of image data from different service providers
- Image conversion
- Fast supply of a simple, working page-turning version of the book (XML)
- Regular bibliographical data updates (ZEND from the OPAC)
- OCR
- Long-term archiving of master files at the LRZ and subsequent deletion of the master files on the MDZ servers
- …

Semi-automatic processes
- Loading of large amounts of bibliographic metadata from the catalogue
- Re-import of lists of the URNs into the catalogue
- Import of prepared Excel sheets for the creation of ToCs in ZEND
- Allocation of page numbers to digitised books (for navigation)
- Transfer of digital masters back from the long-term archive
- …
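The step "archive the master file, then delete it locally" is only safe if the archived copy is verified first. The slides do not say how ZEND does this; the sketch below is one possible approach under stated assumptions: the archive is represented by an ordinary directory standing in for the LRZ archive, and a SHA-256 comparison is used as the verification step. All names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_then_delete(master: Path, archive_dir: Path) -> Path:
    """Copy a master file to the archive and delete the local copy
    only after the archived copy has been verified by checksum."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / master.name
    shutil.copy2(master, target)

    if sha256_of(master) != sha256_of(target):
        # Keep the local master and discard the faulty archive copy
        target.unlink()
        raise IOError(f"checksum mismatch while archiving {master.name}")

    master.unlink()
    return target
```

A call such as `archive_then_delete(Path("work/bsb00001119_00001.tif"), Path("archive"))` would return the path of the archived copy; the paths here are placeholders.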
![Page 64: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/64.jpg)
ZEND Layers

- Capture, Content Management, Indexing
- Publication
- Services
- Workflow Management
- Data Management
![Page 65: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/65.jpg)
![Page 66: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/66.jpg)
Overview: ZEND Production

[Diagram of the production system with workflow steps numbered 1 to 8. Labels recoverable from the slide:]
- Production System: Digitisation on Demand, Project-oriented Digitisation, Conservational Digitisation
- Order
- Exchange of metadata
- Catalogue (OPAC); Network of Bavarian Libraries
- Digitisation; Digitised Object; Definitive file name; URN
- ZEND: administration of all metadata (bibliographic, technical, administrative); automatic image processing
- Online Publication: in-house-only publication; web publication
- Archival Storage
- URN resolving (XEpicur); DNB
- OAI; portals (BLO, ZvDD, Chronicon, ...); search engines
![Page 67: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/67.jpg)
ZEND: Capture/Management (1)

with ZEND Services
![Page 68: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/68.jpg)
ZEND: Capture/Management (2)

Result of the order form: process slip with barcode
![Page 69: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/69.jpg)
ZEND: Capture/Creation of File Name and URN (3)

Automated assignment, together with the creation of the process slip, of the definitive file name
Example: bsb00001119_00001.tif

and at the same time

assignment of a persistent identifier based on the National Bibliography Number
Example: urn:nbn:de:bvb:12-bsb00001119

Assignment is done locally, but the administration of the URNs and the central link resolving are the responsibility of the DNB: http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:bvb:12-bsb00001119
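The naming scheme can be illustrated with a few helper functions. This is a sketch derived only from the single example on the slide: the zero-padding widths (8 digits for the object number, 5 for the page) are inferred from `bsb00001119_00001.tif`, and the function names are hypothetical. The slide's example URN carries no check digit, so none is computed here.

```python
URN_PREFIX = "urn:nbn:de:bvb:12-"  # taken from the example on the slide
RESOLVER = "http://nbn-resolving.de/urn/resolver.pl?urn="


def object_id(number: int) -> str:
    """Object identifier, e.g. 1119 -> 'bsb00001119' (padding width inferred)."""
    return f"bsb{number:08d}"


def master_file_name(obj_id: str, page: int) -> str:
    """Definitive file name of one page image, e.g. 'bsb00001119_00001.tif'."""
    return f"{obj_id}_{page:05d}.tif"


def urn_for(obj_id: str) -> str:
    """Persistent identifier in the form shown on the slide."""
    return URN_PREFIX + obj_id


def resolver_url(urn: str) -> str:
    """Link that asks the DNB resolver to redirect to the current location."""
    return RESOLVER + urn


if __name__ == "__main__":
    oid = object_id(1119)
    print(master_file_name(oid, 1))
    print(urn_for(oid))
    print(resolver_url(urn_for(oid)))
```

Deriving the file name and the URN from the same object identifier keeps the two in step, which matches the slide's point that both are assigned at the same time.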
![Page 70: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/70.jpg)
ZEND: Capture/Management (3)

Books in the process list: status
![Page 71: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/71.jpg)
ZEND: Indexing (4)

Import of bibliographic metadata
![Page 72: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/72.jpg)
ZEND: Indexing (5)

Import of bibliographic metadata via Z39.50
![Page 73: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/73.jpg)
ZEND: After Scanning and Import of the Digital Master Files into the Production System (6)

- Creation of an index file for the online publication of the images
- Automatic production of image formats for the web presentation (JPG, PDF etc.)
- Creation of the browsing structure for the object (ToC Editor)
- OCR processing of the images (optional)
- Storage of all the data at the Leibniz Supercomputing Centre for long-term preservation
![Page 74: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/74.jpg)
ZEND: Indexing after Import (7)

URN – National Bibliography Number
![Page 75: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/75.jpg)
ZEND: Indexing - ToC Editor (8)

Adding structural information and enabling WWW access
![Page 76: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/76.jpg)
Immediate availability of the link inside the local catalogue (OPAC) and the Bavarian Union Catalogue
![Page 77: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/77.jpg)
ZEND: Final Online Publication (9)
![Page 78: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/78.jpg)
![Page 79: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/79.jpg)
Current Mass-digitisation Projects at MDZ

| Project | Volumes | Project period | Pages |
| --- | --- | --- | --- |
| VD16digital-1 | ~ 4,300 | 2006-2008 | ~ 1 M. |
| VD16digital-2 | ~ 37,000 | 2007-2009 | ~ 7.5 M. |
| Incunabula | ~ 9,000 | 2008-2012 | ~ 1.6 M. |
| Public-Private Partnership with | > 1.2 M. | 2008- | > 300 M. |

3 ScanRobots in the MDZ scan centre

Linux cluster of the MDZ for processing the Google book copy
![Page 80: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/80.jpg)
"VD16-1 and VD16-2 digital" Projects

Goal: online publication of all 16th-century books that are unique to the holdings of the BSB
![Page 81: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/81.jpg)
1501-1517 ("VD16-1")
- Pilot project for mass digitisation
- Duration: 2006-2008
- Manual scanning with book scanners (A2)
- Digital master with 300-400 ppi, 24 bit, TIFF with ICC profiles, uncompressed
- ~ 4,300 titles = ~ 20 terabytes of digital master data
- Deeper indexing
- Processing from capture to long-term preservation with ZEND

1518-1600 ("VD16-2")
- Cooperation partner for long-term storage and archiving: Leibniz Supercomputing Centre
- Duration: 2007-2009
- Scanning with automatic book scanners
- Digital master with 300 ppi, otherwise like VD16-1
- 37,000 titles = ~ 175 terabytes of digital master data
- ZEND processing
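The storage figures follow from the master format: an uncompressed 24-bit image needs 3 bytes per pixel, so its size grows with the square of the resolution. The sketch below works this through. The page size of 8 x 12 inches is a purely hypothetical example (the slides give no page dimensions), decimal units are assumed, and the function names are illustrative.

```python
def uncompressed_image_mb(
    width_in: float, height_in: float, ppi: int, bits_per_pixel: int = 24
) -> float:
    """Size in MByte (decimal) of an uncompressed image, ignoring file headers."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * bits_per_pixel / 8 / 1_000_000


def gb_per_title(total_tb: float, titles: int) -> float:
    """Average master data per title in GByte, from the project totals."""
    return total_tb * 1000 / titles


if __name__ == "__main__":
    # Hypothetical 8 x 12 inch page at the two resolutions named on the slide
    print(uncompressed_image_mb(8, 12, 300))
    print(uncompressed_image_mb(8, 12, 400))

    # Project totals from the slide
    print(round(gb_per_title(20, 4_300), 2))    # VD16-1
    print(round(gb_per_title(175, 37_000), 2))  # VD16-2
```

Both projects come out at a little under 5 GByte of master data per title on average.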
![Page 82: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/82.jpg)
Scanning Difficulties in the "VD16-1" Project

- Conservational requirements: only 70% of books could be opened up to 90°
- Scanning with "normal book scanners"
- Need for book cradle support: only one-sided scanning possible (1 scan / click = 1 page)
- 3 process steps
  1. Scanning all left pages
  2. Scanning all right pages
  3. Assembling left and right pages
- The lack of pagination in 16th-century books led to a higher error rate
- Reduction of throughput
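The third process step, assembling the separately scanned left and right pages, amounts to interleaving two sequences. A minimal sketch, assuming that each opening of the book yields exactly one left and one right image and that both passes ran in the same direction; the function name is hypothetical.

```python
def assemble_pages(left_pages: list[str], right_pages: list[str]) -> list[str]:
    """Interleave left and right page scans into reading order.

    A count mismatch is reported instead of silently producing a shifted
    sequence, because a single missed page misaligns everything after it.
    """
    if len(left_pages) != len(right_pages):
        raise ValueError(
            f"{len(left_pages)} left pages but {len(right_pages)} right pages"
        )
    assembled = []
    for left, right in zip(left_pages, right_pages):
        assembled.extend([left, right])
    return assembled


if __name__ == "__main__":
    print(assemble_pages(["L1", "L2"], ["R1", "R2"]))
```

The count check only catches a missed page in one pass. Without printed page numbers, a page missed in both passes goes unnoticed, which is consistent with the higher error rate noted on the slide.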
![Page 83: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/83.jpg)
Scanning Throughput in the "VD16-1" Project

- Average rate: 3 books per scanner per 8-hour working day
- Increasing set-up time due to conservational requirements
- Set-up time =
  - Adjustment of the book before the first scan
  - Scanning of the color/sharpness targets
  - Adjusting the sharpness during the scan process (only for scanning without glass plate)
  - Integration time: writing the data from the CCD/firmware to hard disk
  - Shut-down of the book
- Set-up time varies from object to object
![Page 84: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/84.jpg)
The Way to the ScanRobot

- Objective: optimisation of throughput for the scanning of early and rare prints with limited opening angle
- 2006: market evaluation of automatic book scanners (hardware and software) and tender
- Development partnership with
- Milestones
  - July 2007: 2 prototypes in the BSB
  - January 2008: start of production
  - March 2008: third robot
  - April 2008: start of 4-shift operation from 07.00 a.m. to 11.00 p.m.
![Page 85: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/85.jpg)
"VD16-2": ScanRobot Throughput

- Status now: 10 books per ScanRobot per 8-hour working day
- Total weekly production: 300 books ~ 75,000 pages ~ 2.2 terabytes
- Interim result: the objective of a significant increase in production has been achieved
- Next target, optimisation of the book cradle: 500 + x titles per week
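The weekly total can be reproduced from the per-robot rate under a few assumptions that the slides do not state explicitly: three ScanRobots in operation, two 8-hour blocks per day (07.00 a.m. to 11.00 p.m. is 16 hours) and a five-day week. The sketch also derives the average book length and page size implied by the weekly figures, using decimal units. Function names are illustrative.

```python
def weekly_books(
    robots: int, books_per_robot_per_8h: int, blocks_per_day: int, days_per_week: int
) -> int:
    """Books scanned per week across all robots."""
    return robots * books_per_robot_per_8h * blocks_per_day * days_per_week


def pages_per_book(weekly_pages: int, books: int) -> float:
    """Average number of pages per book implied by the weekly totals."""
    return weekly_pages / books


def mb_per_page(weekly_tb: float, weekly_pages: int) -> float:
    """Average master size per page in MByte (1 TByte = 1,000,000 MByte)."""
    return weekly_tb * 1_000_000 / weekly_pages


if __name__ == "__main__":
    # Assumed: 3 robots, 2 blocks of 8 hours per day, 5 days per week
    books = weekly_books(3, 10, 2, 5)
    print(books)
    print(pages_per_book(75_000, books))
    print(round(mb_per_page(2.2, 75_000), 1))
```

Under these assumptions the product matches the 300 books per week quoted on the slide, with an average of 250 pages per book and just under 30 MByte per page.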
![Page 86: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/86.jpg)
ScanRobot in Action

Scanning 16th-century books with automated book scanners
![Page 87: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/87.jpg)
Digitisation in the GBS Programme

By Google and at Google's expense
- Google Digital Copy: integration into Google's services
- Library Digital Copy: integration into the BSB services
![Page 88: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/88.jpg)
Mass Digitisation

- No "selective picking": no prioritisation of text corpora, shelf numbers, material types etc.
- Logistics determines the procedures and tempo of digitisation (trucks, carts, operating conditions, shifts etc.)
- Selection focuses strictly on conservation purposes, formats or copyright issues
- Continuous optimisation of Google's scanning technologies ("If it doesn't work today, it will tomorrow")
- In principle: if Google cannot digitise for technical or conservational reasons, the MDZ will
![Page 89: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/89.jpg)
Some Workflow Facets

- Organisation of a workflow with many departments for the digitisation of up to 1,000 books or more per day
- Guaranteeing the quality of the workflow and its outcome: track every book in every step of the process
- Collection and documentation of process data (non-scannable marks, comments…)
- The digitisation process shouldn't disturb the normal library processes
![Page 90: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/90.jpg)
The Role of MDZ

- MDZ as part of a BSB-wide project organisation (all departments involved)
- Responsible for the processing of the "Digital Library Copy" from data import to online publication
- Processing software: a special application of ZEND = "Google-ZEND"
- Storage and long-term preservation in cooperation with the Leibniz Supercomputing Centre
![Page 91: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/91.jpg)
Google-ZEND in a Nutshell
![Page 92: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/92.jpg)
DLC – Preview of Online Publication by MDZ

Coming soon …
![Page 93: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/93.jpg)
![Page 94: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/94.jpg)
Outlook

- MDZ is just in the initial phase of mass digitisation
- Long-term preserved data can be processed in the future with new technologies
- Needs for further development and automation in
  - OCR
  - Indexing – Text Mining
  - Quality Control
- More digital content available online allows deeper linking between different repositories
- There are still a lot of challenges …
![Page 95: Mass digitization and long-term preservation – processes ...€¦ · 20 Dr. Markus Brantl Goethe Institute - Singapore/Jakarta June 2008 Digitisation Strategy of the BSB Third-party](https://reader030.vdocuments.mx/reader030/viewer/2022040410/5ec9e8e9910d163d675d4d52/html5/thumbnails/95.jpg)
Thank you for your attention!

Contact: markus.brantl@bsb-muenchen.de