HATHI TRUST A Shared Digital Repository Unpacking HathiTrust’s New Cost Model Jeremy York Project Librarian, HathiTrust SUNY July 15, 2011

HATHI TRUST A Shared Digital Repository

Unpacking HathiTrust’s New Cost Model

Jeremy York
Project Librarian, HathiTrust

SUNY
July 15, 2011

About

Partnership

Arizona State University
Boston University
Baylor University
California Digital Library
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Harvard University Library
Indiana University
Johns Hopkins University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
Michigan State University
New York University
New York Public Library
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Stanford University
Texas A&M University
Universidad Complutense de Madrid
University of California
  – Berkeley
  – Davis
  – Irvine
  – Los Angeles
  – Merced
  – Riverside
  – San Diego
  – San Francisco
  – Santa Barbara
  – Santa Cruz
The University of Chicago
University of Florida
University of Illinois
University of Illinois at Chicago
The University of Iowa
University of Maryland
University of Michigan
University of Minnesota
The University of North Carolina at Chapel Hill
University of Notre Dame
University of Pennsylvania
University of Pittsburgh
University of Utah
University of Virginia
University of Washington
University of Wisconsin-Madison
Utah State University
Yale University Library

Digital Repository

• Launched 2008
• Initial focus on digitized book and journal content
• “Light” archive
  – As accessible as possible within the bounds of law

Statistics

• 8,980,200 volumes
• 4,679,248 book titles
• 214,155 serial titles
• 2,450,522 “public domain”

The Name

• The meaning behind the name
  – Hathi (hah-tee): Hindi for elephant
  – Big, strong
  – Never forgets, wise
  – Secure
  – Trustworthy

Mission

• To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge

Goals

• Comprehensive collection
• Preservation…with Access
• Shared strategies
  – Collection management, development
  – Preservation
  – Copyright
  – Efficient user services
• Openness

Governance

Governance

HathiTrust
• Executive Committee
  – Budget/Finances
  – Decision-making
• Strategic Advisory Board
  – Guidance on Policy, Planning

Executive Committee

• Paul Courant, University Librarian and Dean of Libraries, UM
• Laine Farley, Executive Director, CDL
• John King, Vice Provost for Academic Information, UM
• Paula Kaufman, University Librarian and Dean of Libraries, UI
• Brian Schottlaender, University Librarian, UCSD
• Ed Van Gemert, Deputy Director of Libraries, UW – Madison (ex officio)
• Brenda Johnson, Dean of Libraries, IU
• Brad Wheeler, Chief Information Officer, IU
• John Wilkin, Executive Director of HathiTrust and Associate University Librarian, LIT, UM

Strategic Advisory Board

• Ed Van Gemert (Chair), Deputy Director of Libraries, University of Wisconsin - Madison
• John Butler, AUL for Information Technology, University of Minnesota
• Patricia Cruse, Director, Preservation, CDL
• Todd Grappone, AUL for Digital Initiatives & IT, UCLA
• Julia Kochi, Director, Digital Library and Collections, UC San Francisco
• Sarah Pritchard, University Librarian, Northwestern University
• Paul Soderdahl, Director, LIT, University of Iowa
• John Wilkin, Executive Director, HathiTrust (ex officio)
• Robert Wolven, Columbia University

Constitutional Convention

• October 2011
• Delegates from each institution and consortium
  – Carry certain number of votes determined according to formula approved by Executive Committee
• 3-year review
• Proposals
  – Print management
  – Ballot proposals

Partnership

Partnership

• Who can become a partner?
  – Institutions worldwide
  – Libraries with print holdings

What are the benefits? (1)

• Cost-effective long-term preservation and access services for digitized content
  – Commitments on digital content facilitate decisions about digitization efforts and print collection management
• For those with content, immediately offering long-term preservation, bibliographic and full-text search, collection-building
• With content or not, full viewing and downloading capabilities for public domain materials and materials for which we have received permissions

What are the benefits? (2)

• Specialized access to public domain and in-copyright materials for users with print disabilities

• Other lawful uses of in copyright materials such as Section 108 uses (print replacement copies, digital access to applicable works), access to orphan works

• HathiTrust encourages participation in initiatives and resources geared toward
  – Shared collection development and management (e.g., copyright review work, print holdings database, de-duplication, collaboration with other organizations and initiatives)
  – Participation in governance and collaborative initiatives
  – Defining future directions of the shared library.

What’s involved?

• Contract
  – Sustaining
  – Content-Contributing
• Yearly fees
• Commitment
  – 5-year periods
• Shibboleth
• Print Holdings

Costs

• Base funding from partner institutions
• Basic infrastructure costs
• Commitments in 5-year periods

How much does it cost? (1)

How much does it cost? (2)

• $0.149/volume/year for Google-digitized
• $0.489/volume/year for IA-digitized
• $0.154/volume/year for all content

• $3.40 per GB

Financial contributions of partners

HathiTrust Functional Framework

How does it work? (1)

• Sustaining membership is base
  – Pricing model for all partners beginning 2013
  – Based on overlap of HathiTrust volumes with institutions’ print holdings
  – Share in infrastructure costs for public domain volumes:
    • (PD * C * X) / N
  – Share in infrastructure costs for in copyright volumes based on holdings
    • For a given in copyright volume:
    • IC = (C * X) / H
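Read against the worked example later in the deck, the two formulas can be written out as below. The slide does not define its symbols, so the meanings here are inferred from that example and should be treated as assumptions: PD is the number of public domain volumes, C the per-volume infrastructure cost, X the flexible multiplier, N the number of partners, and H the number of partners holding a given in-copyright volume.

```latex
\[
\text{PD share} = \frac{PD \cdot C \cdot X}{N},
\qquad
\text{IC (per volume)} = \frac{C \cdot X}{H}
\]
```

Under this reading, a partner's total in-copyright share would be the sum of the per-volume IC amounts over the in-copyright volumes it holds.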

How does it work? (2)

• Main factors in costs are
  – Amount of content
  – Number of partners
  – Also a flexible multiplier designed to pay for programmatic activities
• Tend to result in lower costs and more benefits over time

Example

• Factors
  – 1,000,000 PD volumes
  – 3,000,000 IC volumes
  – $0.154 per volume
  – 60 partners
  – Assume on average 12 institutions hold IC volumes
  – Multiplier of 1.5
• Costs
  – PD = (1,000,000 * .154 * 1.5) / 60 = $3,850
  – IC = (3,000,000 * .154 * 1.5) / 12 = $57,750
  – Total = $61,600
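The arithmetic on this slide can be reproduced with a short sketch. The function and variable names are illustrative, not part of any HathiTrust tooling, and the in-copyright calculation uses the slide's simplification of one average holder count for every volume.

```python
from decimal import Decimal


def pd_share(pd_volumes, cost_per_volume, multiplier, partners):
    """Public domain share: (PD * C * X) / N, split evenly across partners."""
    return Decimal(pd_volumes) * cost_per_volume * multiplier / partners


def ic_share(ic_volumes, cost_per_volume, multiplier, avg_holders):
    """In-copyright share, assuming every volume has the same number of holders."""
    return Decimal(ic_volumes) * cost_per_volume * multiplier / avg_holders


COST_PER_VOLUME = Decimal("0.154")  # $/volume/year for all content
MULTIPLIER = Decimal("1.5")         # taken from the slide's arithmetic

pd_cost = pd_share(1_000_000, COST_PER_VOLUME, MULTIPLIER, 60)
ic_cost = ic_share(3_000_000, COST_PER_VOLUME, MULTIPLIER, 12)
total_cost = pd_cost + ic_cost

print(f"PD = ${pd_cost:,.2f}, IC = ${ic_cost:,.2f}, Total = ${total_cost:,.2f}")
```

Decimal is used instead of float so the dollar amounts match the slide exactly.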

How does it work? (3)

• In order to support these calculations
  – Need print holdings database (2013)
  – Update mechanisms
  – Manual remediation
• Analysis will also support
  – Expansion of legal uses of materials, to users who have print disabilities, to orphan works
  – Facilitate collaborative collection development and management operations
  – Will also benefit efforts in de-duplication

Print Holdings Database

• Volumes institutions own or have owned
  – Only print volumes (not microform, etc.)
  – OCLC number [required]
  – Bib record ID [required]
  – Condition (e.g., brittle) [optional]
  – Holding Status (e.g., current holding, withdrawn, missing, etc.) [optional]
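One way to picture a single entry in such a database is the record type below. This is a hypothetical sketch: the slide names the fields and says which are required, but it does not specify a file format, field names, or value vocabularies, so everything beyond the four fields is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrintHolding:
    """One print volume an institution owns or has owned (illustrative only)."""
    oclc_number: str                       # required
    bib_record_id: str                     # required, the institution's local ID
    condition: Optional[str] = None        # optional, e.g. "brittle"
    holding_status: Optional[str] = None   # optional, e.g. "withdrawn"

    def __post_init__(self):
        # Enforce the two fields the slide marks as required.
        if not self.oclc_number.strip():
            raise ValueError("OCLC number is required")
        if not self.bib_record_id.strip():
            raise ValueError("bib record ID is required")


# Hypothetical values, for illustration only.
record = PrintHolding(oclc_number="12345678",
                      bib_record_id="b1000001",
                      condition="brittle")
```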

Percent Overlap

Average = 37.4%

Questions

• Why not get the information from OCLC?
• Is it necessary to declare all volumes held, or could an institution choose not to declare some?
• Are the print holdings data currently provided by institutions taken as an indication of the volumes institutions are declaring they have access to?

What are we doing currently?

• Basing yearly fees on estimates
  – Based on infrastructure costs of anticipated content
  – Estimated partnership growth
  – Institution total volume counts

SUNY Costs

• SUNY University Centers
  – Albany, Binghamton, Buffalo, Stony Brook, Upstate and Downstate Medical Libraries
  – 11,049,952 volumes
• All SUNY (based on 16,000,000 titles)
  – 27 institutions total
  – 20,800,000 volumes

SUNY costs (2)

• Estimate using
  – 9,500,000 volumes at end of 2011
  – 60 partners (for University Centers and Medical libraries)
  – 87 partners (for all SUNY libraries)
  – Multiplier of 1.5

SUNY costs (3)

• University Centers
  – Public Domain
    • Total PD cost * 1.5 / #partners * 6 = $70,903.22
  – In Copyright
    • % of holdings (partner holdings / total holdings) * Total IC cost * 1.5 = $67,635.06
  – Total = $138,538.28
    • Prorated from August 1 = $58,072.21

SUNY costs (4)

• All SUNY
  – Public Domain
    • Total PD cost * 1.5 / 87 * 27 = $220,044.49
  – In Copyright
    • % holdings (partner holdings / total holdings) * Total IC cost * 1.5 = $127,198.61
  – Total = $347,243.09
    • Prorated from August 1 = $145,556.69
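The prorated figures on the two SUNY cost slides are consistent with scaling each annual total by the days left in the calendar year from August 1 (153 of 365). The slides do not state the proration rule, so the day-count basis in this sketch is an assumption that happens to reproduce both numbers.

```python
from decimal import Decimal, ROUND_HALF_UP

DAYS_REMAINING = 153  # August 1 through December 31 (assumed basis)
DAYS_IN_YEAR = 365


def prorate(annual_total):
    """Scale an annual fee to the part of the year remaining, rounded to cents."""
    share = Decimal(annual_total) * DAYS_REMAINING / DAYS_IN_YEAR
    return share.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


university_centers = prorate("138538.28")  # annual total from the slide
all_suny = prorate("347243.09")            # annual total from the slide

print(university_centers, all_suny)
```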

Sustaining v. Content-Contributing

• Does not exclude contribution of content
• If contribute content, costs covered up to amount that would be paid as Sustaining partner
  – Barring additional costs that might be needed to accommodate content (e.g., specialized load routines, generation of OCR)
• Above that, pay per-GB cost ($3.40)
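One possible reading of this rule is sketched below: the storage cost of contributed content is priced at $3.40 per GB, the Sustaining fee covers that cost up to its own amount, and only the excess is billed. The slide does not spell out the calculation, so the function, its name, and the sample figures are assumptions, and the one-off costs mentioned above (load routines, OCR) are left out.

```python
from decimal import Decimal

PER_GB = Decimal("3.40")  # per-GB cost from the slide


def additional_content_fee(content_gb, sustaining_fee):
    """Amount owed beyond the Sustaining fee for contributed content."""
    content_cost = Decimal(content_gb) * PER_GB
    return max(Decimal("0"), content_cost - Decimal(sustaining_fee))


# Hypothetical partner paying a $30,000 Sustaining fee.
print(additional_content_fee(5_000, "30000"))   # content cost is below the fee
print(additional_content_fee(10_000, "30000"))  # content cost exceeds the fee
```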

Summary

• Partners share in costs of sustaining common resource
• Share in uses of relevant materials
• Voice in future directions
• Costs to institutions go down
• Quality of services increases
  – Realize in an aggregated collection something you don’t get through distributed search or federation
• Free riders?

Changing Library Landscape

• Rapidly changing landscape
• Libraries are making these decisions but they are more and more collective decisions
• We cannot afford anymore to do work separately that could be done collaboratively

HathiTrust overall benefits to libraries

• Digital Curation
  – Drive costs down
  – Reduce “bibliographic indeterminacy”
  – Make meaningful decisions about formats and quality
  – Increase discoverability, use
  – Consolidate development talent
  – Improve strength of archiving
• Print Curation
  – Means to associate our print holdings
  – Coordinated record-keeping
• Subsidiary benefits
  – Quantify problems
  – Collective attention to solving shared problems

How to find out more

• Web site “About” section: http://www.hathitrust.org/about
• Twitter: http://twitter.com/hathitrust
• Monthly newsletter: http://www.hathitrust.org/updates
• RSS: http://www.hathitrust.org/updates_rss
• Contact us: [email protected]

Thank you very much!