Digitization Fundamentals: Text
![Page 1: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/1.jpg)
Digitization Fundamentals: Text
Laura Weakly
![Page 2: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/2.jpg)
Do you need to digitize?
![Page 3: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/3.jpg)
Has it already been digitized? Can you use it?
![Page 4: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/4.jpg)
What is the source? What do you want to do with it?
![Page 5: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/5.jpg)
Resources on campus
• Media Services (221 Love Library South): flatbed, microform, large format, 3D
• Geology Library (10 Bessie Hall)
• New Media Center (116 Architecture Hall)
• Pixel Lab (123 Henzlik Hall)
![Page 6: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/6.jpg)
Archival Standards
• National Archives and Records Administration
http://www.archives.gov/preservation/technical/guidelines.pdf
• Library of Congress
http://memory.loc.gov/ammem/about/techStandards.pdf
![Page 7: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/7.jpg)
LOC Chart
![Page 8: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/8.jpg)
What does this mean?
• Pixels, Resolution, Bit Depth, DPI
• File Formats, Compression
Canadian Heritage Information Network
http://www.pro.rcip-chin.gc.ca/cours-courses/fondamentales_numerisation-digitization_fundamentals/index-eng.jsp
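To make the relationship between pixels, resolution (DPI), and bit depth concrete, here is a rough worked example. It is only a sketch: the function name and the sample page size are illustrative assumptions, not part of the original slides, and the result is the uncompressed size before any file format or compression is applied.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bit_depth):
    """Estimate the uncompressed size of a scan in bytes.

    width_in, height_in: physical size of the original in inches
    dpi: scanning resolution (pixels per inch)
    bit_depth: bits stored per pixel (1 = bitonal, 8 = grayscale, 24 = RGB color)
    """
    width_px = round(width_in * dpi)    # pixels across
    height_px = round(height_in * dpi)  # pixels down
    total_bits = width_px * height_px * bit_depth
    return total_bits // 8              # 8 bits per byte


# Illustrative case: a letter-size page (8.5 x 11 in) at 300 DPI in 24-bit color
size = uncompressed_size_bytes(8.5, 11, 300, 24)
print(f"{size / 1_000_000:.1f} MB")  # prints "25.2 MB"
```

Doubling the DPI quadruples the pixel count, which is why resolution choices have a large effect on storage.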
![Page 9: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/9.jpg)
File naming & organizing
• Think about your naming convention BEFORE digitizing
• Be descriptive, but not too descriptive
• If using dates, standardize for computer sorting: YYYYMMDD
• Use leading 0s
• Rename, if necessary (Automator on Mac)
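The naming tips above can be sketched in a few lines of Python. The pattern used here (collection, date, sequence number) and the function name are hypothetical choices for illustration; any consistent convention would work.

```python
from datetime import date


def scan_filename(collection, item_date, sequence, ext="tif", width=4):
    """Build a sortable file name such as 'letters_19230415_0007.tif'.

    The date is written as YYYYMMDD and the sequence number is padded
    with leading zeros so that names sort correctly as plain text.
    """
    stamp = f"{item_date.year:04d}{item_date.month:02d}{item_date.day:02d}"
    return f"{collection}_{stamp}_{sequence:0{width}d}.{ext}"


# Without leading zeros, text sorting puts page 10 before page 2.
print(sorted(["page2.tif", "page10.tif"]))
# prints ['page10.tif', 'page2.tif']

# With leading zeros, text order matches the intended order.
names = [scan_filename("letters", date(1923, 4, 15), n) for n in (1, 2, 10)]
print(sorted(names) == names)
# prints True
```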
![Page 10: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/10.jpg)
File naming examples
Too much (above);
Too little (right)
![Page 11: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/11.jpg)
File naming examples
![Page 12: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/12.jpg)
Metadata
• Record in a spreadsheet (Excel, Google Docs)
• Title, Subject, Description, Creator, Source, Publisher, Date, Contributor, Rights, Relation, Format, Language, Type, Identifier, Coverage
• ScanFileName, Title, Subject, Description, Creator, Publisher, PubPlace, Contributors, OriginalDate, Type, Format, OriginalSize, Source Identifier, CollectionTitle, CollectionCreator, Copyright Ownership, ScanResolution, ScanDate, Publisher, Gray/RGB, ScanningNotes, ManipulationNotes, Operator, Scanner/ColorBar, MasterFileLocation
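As one possible way to start such a spreadsheet, the sketch below writes a CSV file whose columns are the fifteen elements from the first list above. The sample record is invented for illustration, and the function name is an assumption; the resulting text can be saved as a `.csv` file and opened in any spreadsheet program.

```python
import csv
import io

# Column headings taken from the first field list on this slide.
DUBLIN_CORE_FIELDS = [
    "Title", "Subject", "Description", "Creator", "Source",
    "Publisher", "Date", "Contributor", "Rights", "Relation",
    "Format", "Language", "Type", "Identifier", "Coverage",
]


def metadata_csv(records):
    """Return CSV text with one header row and one row per record.

    Fields missing from a record are left blank; a field name that is
    not in DUBLIN_CORE_FIELDS raises ValueError, which catches typos.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DUBLIN_CORE_FIELDS, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record, for illustration only.
record = {
    "Title": "Letter to a friend",
    "Date": "19230415",
    "Format": "image/tiff",
    "Identifier": "letters_19230415_0001",
}
print(metadata_csv([record]))
```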
![Page 13: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/13.jpg)
Metadata
• Dublin Core
http://omeka.org/codex/Working_with_Dublin_Core
• Text Encoding Initiative
http://www.tei-c.org/release/doc/tei-p5-doc/en/Guidelines.pdf
• Encoded Archival Description
http://www.loc.gov/ead/
![Page 14: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/14.jpg)
Optical Character Recognition (OCR)
• Transcription
• ABBYY FineReader http://www.abbyy.com
• OmniPage http://www.nuance.com/for-business/by-product/omnipage/standard/index.htm
• Tesseract https://code.google.com/p/tesseract-ocr/
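For the open-source option, a minimal sketch of driving the Tesseract command-line tool from Python might look like the following. It assumes Tesseract is installed and on the PATH; the function names and file names are hypothetical. The basic command form is `tesseract imagename outputbase -l lang`, which writes the recognized text to `outputbase.txt`.

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list for a basic Tesseract run."""
    return ["tesseract", image_path, output_base, "-l", language]


def run_ocr(image_path, output_base, language="eng"):
    """Run Tesseract on one image and return the path of the text file.

    Raises RuntimeError if the tesseract executable cannot be found.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(tesseract_command(image_path, output_base, language), check=True)
    return output_base + ".txt"


# Example call (hypothetical file name; requires Tesseract to be installed):
# text_file = run_ocr("letters_19230415_0001.tif", "letters_19230415_0001")
```

OCR output usually still needs proofreading, especially for older typefaces or damaged originals.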
![Page 15: Digitization Fundamentals: Text](https://reader030.vdocuments.mx/reader030/viewer/2022012417/6171dc8da779bc4217295c69/html5/thumbnails/15.jpg)
Thank you!