Texas Newspaper PDF Preservation: A Low-Cost Solution with Tremendous Value

ana.krahmer@unt.edu

Ana Krahmer, Digital Newspaper Program Coordinator

Mark Phillips, Assistant Dean of Digital Libraries

University of North Texas Libraries

Presented August 20, 2014, for the World Library and Information Congress, Session 170

Overview

• What is TDNP?
• Current PDF Newspapers
• The Texas Press Association Archive
• Technology and Standards

What is TDNP?

About us

What is TDNP?

• Dedicated to preserving Texas newspapers, from any time or place, for any title.
• Thus far, we host over 2 million pages of newspapers, dating from 1829 to present.

Workflow

Current PDF Newspapers

Preservation on The Portal to Texas History

Current PDF Newspapers

• Began working with PDFs in 2010.
• Since then, have added a total of 13 additional titles.
• Earliest PDF issue is from 18 March 1998 (University of Dallas).
• PDFs are acquired from publishers.
• Permissions granted by publishers.

Current PDF Newspapers

• Option to embargo
• Example: Cherokeean Herald

Texas Press Association Archive

Preserving Recent Texas History

TPA Archive Partnership

• Collaboration with the Texas Press Association and NewzGroup out of Missouri.
• 2TB of PDF newspapers, embargoed until publishers grant permission.
• Currently contacting publishers across Texas for embargo terms and online rights.

Collaboration with NewzGroup

• Preserving current Texas NewzGroup PDFs, all under embargo.
• We have the capability to open or hide issues at publisher’s request.
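
The embargo and open/hide behavior described above can be illustrated with a small sketch. The slides do not describe how the Portal actually stores embargo terms, so everything here is hypothetical: the `Issue` record, its `embargo_days` and `hidden` fields, and the `is_public` function are names invented for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Issue:
    """Hypothetical record for one newspaper issue (not the Portal's schema)."""
    title: str
    published: date
    # Assumption: None means the publisher has not yet granted online rights.
    embargo_days: Optional[int] = None
    # Assumption: set to True when a publisher asks for the issue to be hidden.
    hidden: bool = False


def is_public(issue: Issue, today: date) -> bool:
    """Return True if the issue may be shown online on the given day."""
    if issue.hidden or issue.embargo_days is None:
        return False
    return today >= issue.published + timedelta(days=issue.embargo_days)
```

Under these assumptions, an issue with no recorded permission stays closed indefinitely, which matches the "embargoed until publishers grant permission" default; opening or hiding an issue is a change to one field rather than a change to the preserved files.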

Technology and Standards

File types, software, and metadata

Filetypes

• The PDF print master is the preservation copy.
• The print master is saved as a JPG at 400 dpi, from which derivatives are created.
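
As a rough illustration of the 400 dpi step, the sketch below computes the pixel size of a rendered page and builds a rendering command. The slides list Adobe Acrobat as the software in use, so the choice of Poppler's `pdftoppm` command-line tool here is an assumption, as are the function names; the only fixed fact used is that PDF page sizes are measured in points at 72 per inch by default.

```python
import subprocess
from pathlib import Path

POINTS_PER_INCH = 72  # default PDF user-space units per inch


def pixels_at_dpi(width_pt, height_pt, dpi=400):
    """Pixel dimensions of a page of the given size (in points) at `dpi`."""
    return (
        round(width_pt / POINTS_PER_INCH * dpi),
        round(height_pt / POINTS_PER_INCH * dpi),
    )


def build_render_command(pdf_path, out_prefix, dpi=400):
    """Build a pdftoppm command that writes one JPG per page at `dpi`."""
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pdf(pdf_path, out_dir, dpi=400):
    """Render every page of the PDF into out_dir (requires pdftoppm installed)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = out_dir / Path(pdf_path).stem
    subprocess.run(build_render_command(pdf_path, prefix, dpi), check=True)
```

For example, `pixels_at_dpi(612, 792)` gives the size of a US Letter page (8.5 by 11 inches) at 400 dpi: 3400 by 4400 pixels. Newspaper pages are larger, so the resulting JPGs grow accordingly.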

Software

• Adobe Acrobat
• Batch renaming application
• Python scripts
• Microservices

Metadata

• In-house system based on qualified Dublin Core metadata elements.
• Minor differences: bag-info files (BagIt) for PDFs contain the following information (red text is unique to PDF materials).

Source-Organization: University of North Texas Libraries
Organization-Address: P. O. Box 305190, Denton, TX 76203-5190
Contact-Name: Mark Phillips
Contact-Phone: 940-565-2415
Contact-Email: mark.phillips@unt.edu
External-Description: Newspaper issues of the "NEWSPAPER NAME HERE" published in [CITY], Texas. Issues were made available from born-digital PDF printmasters. Partner institution is the [partner library here]. Master files were PDF printmasters from which derivative JPGs were created.
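
A bag-info file like the one above is a plain list of `Label: value` lines, so filling in the placeholders for each title can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the scripts UNT actually runs; the function name `render_bag_info` and its parameters are assumptions, while the field labels and wording come from the slide.

```python
# Fields that are the same for every bag, taken from the slide.
FIXED_FIELDS = [
    ("Source-Organization", "University of North Texas Libraries"),
    ("Organization-Address", "P. O. Box 305190, Denton, TX 76203-5190"),
    ("Contact-Name", "Mark Phillips"),
    ("Contact-Phone", "940-565-2415"),
    ("Contact-Email", "mark.phillips@unt.edu"),
]

# The placeholders stand in for the bracketed slots on the slide.
DESCRIPTION_TEMPLATE = (
    'Newspaper issues of the "{title}" published in {city}, Texas. '
    "Issues were made available from born-digital PDF printmasters. "
    "Partner institution is the {partner}. "
    "Master files were PDF printmasters from which derivative JPGs were created."
)


def render_bag_info(title, city, partner):
    """Return bag-info text for one newspaper title (hypothetical helper)."""
    description = DESCRIPTION_TEMPLATE.format(title=title, city=city, partner=partner)
    fields = FIXED_FIELDS + [("External-Description", description)]
    # One "Label: value" pair per line; long values are left unfolded here.
    return "\n".join(f"{label}: {value}" for label, value in fields) + "\n"
```

A call such as `render_bag_info("Example Herald", "Denton", "Example Public Library")` uses made-up values and returns six lines, ending with the filled-in External-Description. The text would then be written to `bag-info.txt` at the top level of the bag.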

Questions?

Thank you!

Contact us

Email: ana.krahmer@unt.edu
Phone: 940-565-3367
Website: http://tdnp.unt.edu
