Shared Digital Access and Preservation Strategies for Serials at the Center for Research Libraries

Bernard F. Reilly and James Simon
Center for Research Libraries, Chicago, Illinois, USA

The Serials Librarian: From the Printed Page to the Digital Age
Publisher: Routledge
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/wser20
Published online: 28 Sep 2010.

To cite this article: Bernard F. Reilly & James Simon (2010) Shared Digital Access and Preservation Strategies for Serials at the Center for Research Libraries, The Serials Librarian: From the Printed Page to the Digital Age, 59:3-4, 271-280, DOI: 10.1080/03615261003619060

To link to this article: http://dx.doi.org/10.1080/03615261003619060



The Serials Librarian, 59:271–280, 2010
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-526X print/1541-1095 online
DOI: 10.1080/03615261003619060

SERIALS COLLECTION MANAGEMENT IN RECESSIONARY TIMES: PART 2

Edited by Karen Lawson

Shared Digital Access and Preservation Strategies for Serials at the Center for Research Libraries

BERNARD F. REILLY and JAMES SIMON
Center for Research Libraries, Chicago, Illinois, USA

The Center for Research Libraries and its members have engaged in several strategic programs in recent years to provide persistent access to electronic versions of its historical collections, particularly serials. One of these programs is the World Newspaper Archive, a partnership of CRL member institutions and Readex, a division of NewsBank. Through the World Newspaper Archive program, CRL provides its members robust access to historical newspaper content, federates the costs of this large-scale undertaking, and provides for community control over content, archiving provisions, and future collections.

KEYWORDS newspaper digitization, Latin American newspapers, collaborative digitization

INTRODUCTION

The World Newspaper Archive (WNA), a collaborative effort of the Center for Research Libraries and its partner institutions, preserves and provides persistent electronic access to historical newspapers from around the globe. The effort grew out of increased interest among electronic publishers in digitizing newspaper collections from world regions held by major U.S. research libraries. Financial and in-kind support from CRL member institutions helped launch the program in 2008.

Address correspondence to Bernard F. Reilly, President, Center for Research Libraries, 6050 S. Kenwood Ave., Chicago, IL 60637-2885, USA. E-mail: [email protected]


BACKGROUND

Through a century of sustained investment in acquisition, documentation, and preservation, North American research libraries have amassed a large and valuable corpus of newspapers from all regions of the world. Those libraries’ aggregate holdings of newspapers in paper and micro-formats constitute a body of historical and cultural evidence spanning four centuries, which is not, and could not be, replicated elsewhere.

While preservation of global newspapers on microform has enjoyed support in the United States and elsewhere, until recently few efforts existed to convert these resources to electronic format. The National Digital Newspaper Program, the flagship effort in the United States, addresses only U.S. newspaper content and is still early in its implementation. Institutions in Europe and other developed nations have begun efforts to convert their own historical contents, but the scale of these efforts is daunting.

In 2005 several major U.S. research libraries reported being approached by electronic publishers proposing to digitize their holdings of newspapers from Latin America, sub-Saharan Africa, the Middle East, and other world regions. A number of those libraries chose to explore the possibility of organizing collectively rather than making individual arrangements with the publishers. These libraries believed that by acting together, they could exert a stronger hand in dealing with the aggregators, and that the benefits of the undertaking might also accrue to the broader library community.

The CRL was asked by its members to organize this collective effort. CRL has traditionally focused its cooperative preservation efforts on world regions with less robust library and preservation infrastructure. Since 1956 CRL has been microfilming news titles from Africa, Eastern Europe, Latin America, the Middle East, and South and Southeast Asia. In addition, CRL has served as the organizational umbrella for more than a dozen cooperative collecting and preservation programs, providing communications, logistical and financial management, and operational support.

In 2006, CRL developed and issued a request for information (RFI) for a World Newspaper Archive, outlining the general goals and parameters of the digitization effort for publishers. CRL framed the effort not merely as a digitization project, but as a means to ensure persistent digital access to an extensive and important body of primary source materials under the aegis of the research libraries community. Encouraged by the response from the publishing community, CRL then issued a formal request for proposals (RFP) to identify the optimum partner for the program. NewsBank Inc., parent company of Readex, was chosen as the organization that brought the greatest combination of past performance, technical capabilities, and content-area priorities to the endeavor.

In June 2008, the World Newspaper Archive effort was launched to undertake the systematic, large-scale digitization of world newspapers and news-related materials held by CRL and its participating libraries. The WNA will be an ongoing, multiyear, and multistage endeavor wherein CRL and affiliates combine expertise and resources to digitize and make available for scholarly use newspaper holdings from various world regions. The first phase of the effort involves content from Latin America, starting initially with material in the public domain and extending the effort over a number of phases. Details of the effort are currently found on the CRL website at: www.crl.edu/collaborative-digitization/world-newspaper-archive

GOALS

The World Newspaper Archive project has three major goals:

1. Community Access. Readex, a division of NewsBank, will provide electronic access to back files of newspapers from microform and paper holdings of CRL and several major newspaper repositories. Contributing repositories for the first phase include Harvard University, the New York Public Library, University of Florida, University of Illinois at Urbana-Champaign, University of Texas at Austin, University of Washington, University of California, Berkeley, and University of California, Los Angeles. The World Newspaper Archive will employ the robust and reliable search and discovery platform used by Readex’s major newspaper databases: Early American Newspapers and Hispanic American Newspapers. The annual access fee will be nominal for CRL members and partner libraries that have contributed support and/or content to create the collection.

2. Persistence. CRL will ensure the long-term persistence and continued functionality of the news content for its community, as well as CRL member control over the future costs and quality of that access. CRL will retain microform copies of all newspapers in the WNA, and Readex will provide for the archiving of the digital files generated by the project in a manner approved by CRL. Moreover, the process of locating and preparing these materials for digitization is generating valuable preservation metadata and information about existing holdings of these rare materials. CRL will make this “last-copy” information available to guide library decisions about preserving and digitizing local holdings. The project is also subsidizing the cost of replacing lost, damaged, or deteriorated microform copies of the newspapers.

3. Growth. NewsBank shares with CRL a strong commitment to identifying and preserving primary source materials for international studies and research worldwide. This new working relationship will give the CRL library community a voice in the digital conversion of the news collections they have preserved for the past century in print and microform. We also hope, with NewsBank, to make available to the CRL community a growing collection of electronic news content on favorable terms.

LATIN AMERICAN NEWSPAPERS—THE FIRST MODULE OF WNA (FIGURE 1)

The WNA selected Latin America as the first area for collaboration, after broad consultation with the CRL membership, advisory groups, and contributing partners.

Latin American newspapers often have long and unbroken publishing histories, dating back to the mid-19th century or earlier, allowing for extensive examination of events over time. The Latin American press has frequently played an important historic role in the education of society and the evolution of democratic thinking. In Brazil, for example, the press was instrumental in the development of political thought in the formation of the nation-state, the abolition of slavery, and the move toward independence. Similarly, in Mexico, the press helped shape public opinion during the Mexican War of Independence from Spain. In other countries with fewer presses and lower literacy rates (such as Peru and Chile), the primacy of the popular press occurred after independence but still played an important role: the broad distribution of information formerly conveyed by word of mouth.

FIGURE 1 World Newspaper Archive: Latin American Newspapers title page.

Since the 1930s, considerable effort has been made by libraries and archives to identify and preserve Latin America’s most important news sources. Harvard University’s early microfilm efforts included a number of Latin American newspaper titles from such countries as Argentina (La Capital, La Nación, and La Prensa), Brazil (Jornal do Commercio), Colombia (El Tiempo), Peru (Comercio), Mexico (El Universal), and others. The Pan American Union (predecessor of the General Secretariat of the Organization of American States) also microfilmed more than a dozen titles from a variety of countries.

In 1956, the Foreign Newspaper Microfilm Project (FNMP) was established to provide worldwide coverage of representative foreign titles, with CRL as its administrative home. This collaborative program subsumed the earlier-mentioned efforts of Harvard and the Pan American Union and extended the range of titles from Latin America. The FNMP continues to film current titles from numerous countries, available for purchase or loan through CRL.

While organizations such as FNMP and the Library of Congress focus primarily on currently published titles, CRL’s Latin American Microform Project (LAMP) has from its inception pursued historical back files of newspapers. Some of the earliest LAMP projects included the original filming of Siempre from Mexico City (1953–87); the West Coast Leader (1912–40) from Lima, Peru; and the Buckley collection of newspaper clippings on revolutionary Mexico, held at the University of Texas at Austin.

Another significant source of content is the International Coalition on Newspapers (ICON), a multi-institutional cooperative effort to increase the availability of international newspaper collections by improving both bibliographic and physical access to these resources, and to preserve global cultural heritage through the preservation of international newspaper collections held in the United States and abroad. Over the past decade, ICON has preserved more than 840 reels and 45 newspaper titles from a variety of regions. ICON’s Latin American representation thus far has included newspapers from Argentina, Colombia, Costa Rica, Mexico, Peru, and Venezuela, totaling 14 titles where significant gaps in preservation had been identified.

Conversion of historical holdings has received significant support from governmental organizations such as the Department of Education and the National Endowment for the Humanities (NEH). Major grants have allowed for special collections of Latin American materials to be preserved, such as “Revolutionary Mexico in Newspapers, 1900–1929” and “Independent Mexico in Newspapers from the 19th Century,” both filmed by the University of Texas at Austin. Together, these two projects preserved runs of more than 900 titles held by the Benson Collection and other North American libraries.

Numerous other institutions have pursued the historical record of Latin American newspapers. Harvard University and New York Public Library have supported preservation efforts with significant institutional funding. The University of Florida Libraries has been collecting Latin American research resources since the late 1920s. Its Latin American Collection, formally established in 1967, includes more than 7,000 reels of newspaper holdings. The collection’s archive of master microfilm negatives has in some cases become the archive-of-record, as source originals were lost to fire, hurricane, war, and climate conditions. Other institutions including (but not limited to) University of California, Berkeley, University of New Mexico, and Princeton, Stanford, Tulane, and Yale Universities have picked up various back file holdings of Latin American content over time.

Content Selection

It is from this collective body of material that the World Newspaper Archive can draw in its selection of material for conversion. For Latin American Newspapers, participating institutions submitted lists of available content to CRL for sorting and collation. CRL used a variety of tools for the identification and prioritization of content. These included the ICON database of international newspapers (http://icon.crl.edu); holdings information contained in the historical publication Latin American Newspapers in United States Libraries: A Union List, compiled by Steven M. Charno of the Serials Division, Library of Congress; OCLC’s WorldCat database; and other print and online resources.

By agreement from the affiliated organizations, the WNA chose to begin its work with content currently in the public domain (generally speaking, content produced prior to 1923). CRL created a detailed spreadsheet of content that matched the general parameters set out by the advisory for consideration by an expert panel of selectors.
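The public-domain rule of thumb described above can be illustrated as a simple filter over a holdings list. This is a minimal sketch only: the `Holding` record, its field names, and the sample entries are hypothetical and are not drawn from CRL's spreadsheet, and the 1923 cutoff stands in for the article's "generally speaking" threshold rather than a legal determination.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: content produced before this year is treated as public domain,
# following the article's "generally speaking" description.
PUBLIC_DOMAIN_CUTOFF = 1923


@dataclass(frozen=True)
class Holding:
    """Hypothetical record for one title's run as reported by a library."""
    title: str
    country: str
    first_year: int
    last_year: int


def public_domain_span(
    holding: Holding, cutoff: int = PUBLIC_DOMAIN_CUTOFF
) -> Optional[Tuple[int, int]]:
    """Return the (first, last) years of the run that fall before the cutoff,
    or None if no part of the run qualifies."""
    last = min(holding.last_year, cutoff - 1)
    if holding.first_year > last:
        return None
    return (holding.first_year, last)


# Made-up sample entries, for illustration only.
holdings = [
    Holding("Example Daily", "Argentina", 1870, 1950),
    Holding("Example Weekly", "Mexico", 1930, 1960),
]

# Keep only titles with some pre-cutoff content, with the qualifying years.
candidates = {
    h.title: span
    for h in holdings
    if (span := public_domain_span(h)) is not None
}
```

In this sketch a title that straddles the cutoff is kept, but only for its earlier years, which mirrors the idea of starting with public-domain material and extending coverage in later phases.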

The selection panel included newspaper librarians, collection development officers, library directors, and Latin American subject specialists. Institutions were asked to poll faculty members to ascertain local priorities and broader interest among the community. Finally, the information was shared more generally with representatives of CRL and the Latin American Microform Project to ensure community support for the content selected.

The selection committee was asked to consider a variety of factors, including

● Historical or scholarly value
● Broad geographic and temporal coverage
● Balance and diversity of opinion
● Inclusion of lesser-known materials along with well-established titles

During discussions, selectors expressed the following additional considerations:


● Attention paid to current and emerging research interests, focusing on particular regional strengths

● Inclusion of at least one long-running national newspaper from as many countries as possible, making as complete a run through fill-ins where practical

● More narrowly defined parameters for selection (such as “World War I era”) to allow for greater depth of study

Finally, selectors expressed the general desire to be cognizant of inherent “bias” in WNA collections based on materials readily available, rather than seeking opportunities for the best possible content. Part of this may include material not yet filmed, although it was recognized that hard copy may be more expensive to process.

Production and Display

Following the approval of a content list, CRL staff began preparations for production. This included detailed scoping of holdings of each and every title to ensure as complete a run as possible; notification to all partner institutions of their anticipated contributions and establishment of a collaboration framework; and detailed discussions with Readex to develop a production schedule and come to agreement on technical specifications. CRL and Readex engaged in content sampling and assessment of files, metadata, and optical character recognition (OCR) prior to full implementation.

For the first phase, CRL and Readex agreed to a production schedule entailing conversion of approximately 70,000 pages per month. Materials would be made available on Readex’s platform on a rolling basis, with an initial production release date of December 2008 and completion date of fall 2009. On completion, Latin American Newspapers would include approximately 35 fully searchable newspapers, with a projected 975,000 pages of content.
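As a rough check on these figures, the stated monthly rate and projected page count imply a production run of about 14 months. The short sketch below shows the arithmetic; it assumes a steady conversion rate with no ramp-up or pauses, which the article does not state, and the function name is illustrative.

```python
import math

PAGES_PER_MONTH = 70_000    # approximate conversion rate given in the text
PROJECTED_PAGES = 975_000   # projected size of Latin American Newspapers


def months_needed(total_pages: int, pages_per_month: int) -> int:
    """Whole months required to convert total_pages at a constant rate."""
    if pages_per_month <= 0:
        raise ValueError("pages_per_month must be positive")
    return math.ceil(total_pages / pages_per_month)


months = months_needed(PROJECTED_PAGES, PAGES_PER_MONTH)  # 975,000 / 70,000 ≈ 13.9
```

Since the initial release was in December 2008 and completion was planned for fall 2009, a 14-month estimate suggests that conversion was under way before the first release, which fits the rolling-release approach described above.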

As content comes from a variety of libraries, CRL coordinated the arrangements for provision of materials to Readex. Content partners would provide access to negative microfilm, if available, which would be scanned with a minimum resolution of 400 dpi, bitonal or grayscale, with TIFF output. Readex would process the OCR and files and mount them on the same content platform used for America’s Historical Newspapers and other Readex news products.

The result, first made available in December 2008, offers a variety of navigation and display features, including the ability to browse titles by date and to limit searches by title, language, date, or place of publication (Figure 2).

The product features allow full-text searching of every title, with a page-image preview to assist in narrowing down selection (Figure 3).


FIGURE 2 Date browse feature (pictured: La Prensa [Buenos Aires, Argentina]).

FIGURE 3 Full-text search feature (pictured: search for “Estados Unidos”).


FIGURE 4 Page view (pictured: El País [Mexico City, Mexico], June 16, 1913).

By selecting a result, users can view dynamically generated JPEG images of the article or page, and can navigate within a page or browse from page to page or from one search result to the next. Users can also download PDF versions of each page or an entire issue for further research (Figure 4).

FUTURE PHASES

At the time of this writing, the Latin American Newspapers module is nearly complete, and CRL members and nonmembers can access nearly one million pages of content. It is predicted that due to the identification of additional years and holdings, the module may extend to more than 1.3 million pages by its completion.

CRL and Readex are already in production for the next module of the WNA, which will include up to 40 titles of African Newspapers from 1800 to 1922. CRL and its partner institutions expect to add additional collections to the WNA over the next several months, including South Asian Newspapers and newspapers from the Slavic region and Eastern Europe. As before, CRL and its advisors will select the content of these new collections from the international newspapers collected and preserved in paper and microform by the community of participating member libraries.


Such an ambitious initiative, featuring a massive amount of data from several world regions, requires an ongoing, multistage commitment by CRL and affiliated libraries to combine expertise and resources, to digitize and make available holdings of newspapers. Nonetheless, CRL is committed to extending the project to different world regions and is in the planning stage of prioritizing the next crucial phases.
