The Big Shift: Managing Research Collections in the Cloud

Constance Malpas, Program Officer, OCLC Research. The Big Shift: Managing Research Collections in the Cloud. Annual Meeting, 28 April 2011

Upload: constance-malpas

Post on 18-Nov-2014

DESCRIPTION

Keynote presentation from ASERL annual meeting, Vanderbilt University, April 2011.

TRANSCRIPT

Page 1: The Big Shift: Managing Research Collections in the Cloud

Constance Malpas, Program Officer, OCLC Research

The Big Shift: Managing Research Collections in the Cloud

Annual Meeting, 28 April 2011

Page 2

Roadmap

• Think Big – sourcing and scaling, mega regions

• Emerging infrastructure – managing collections ‘in the cloud’

• Shared print service provision - opportunities, challenges

• ASERL in perspective – regional and system-wide context

Page 3

You are … where?

http://www.creativeclass.com/whos_your_city/maps/#Mega-Regions_of_North_America

Page 4

A Master Plan for a mega region

“[Midwestern universities] work together on both regional and national agendas, merging library and research resources, and sharing curricula and instructional resources with faculty and students. Aggregating these spires of excellence by linking these institutions gives the Midwest region many of the world’s leading programs in a broad range of key knowledge areas.” (p. 37)

“Sharing of library and research facilities can augment scholarly production and assure fuller use of cultural assets without great extra cost to the state.” (p. 37)

Page 5

Boundary work and the library ‘service bundle’

Shared print is a prime example: a core operation that is moving “outside” institutional boundaries

University of California, Orbis Cascade, WEST, CIC, TRLN, Hathi Print, CAVAL, UKRR, JURA, etc.

Page 6

A ‘Big Shift’ in attention, resources

Page 7

Shared Print: what’s the problem?

Shift in scholarly attention from print to electronic means low-use retrospective print collections are perceived to deliver less library value

Competing demands for library space: teaching, learning, collaborative research vs. “warehouse of books”

Among academic libraries, a shrinking pool of institutions with mandate, capacity to support print preservation

As transaction costs for managing legacy print collections decrease, libraries will seek to externalize print operations to shared repositories

Page 8

Shared Print: OCLC Research

Active portfolio of work since 2007:

• North American library storage capacity (2007)

  • ~70M volumes in storage; cooperative models in the minority

• Policy requirements for shared print repositories (2009)

  • critical need: disclosure of print preservation commitments

• Leveraging infrastructure: MARC21 583 Action Note (2009/2011)

  • copy-level retention, condition statements are required

• Cloud-sourcing research collections (2010)

  • mass digitization of monographs accelerates shift to shared print

Page 9

Shared Print value proposition(s)

1) Ensures long-term survivability of ‘last copies’ and low-use print journals and books

Extension of traditional repository function; limited motivation to subsidize

2) Enables reduction in redundant inventory for moderately and widely-held titles, facilitating redirection of library resources toward more distinctive service portfolio

Strategic reserve provides a hedge against disruption in the marketplace, rapid fluctuations in scholarly value & function of print; provides tangible value to participant

Page 10

Growth of US library storage infrastructure

[Chart: built capacity in volume equivalents (2007), 0 to 140,000,000, by date of original construction, 1982 to 2008]

Aggregate off-site capacity has increased exponentially

+ 70 million volumes in storage (2007)

Chart annotations: 2 high-density facilities; 68 high-density facilities

Derived from L. Payne (OCLC, 2007)

Page 11

Aggregate preservation resource: a black box?

Of 68 storage facilities identified in Payne (OCLC, 2007):

• 2 are visible in WorldCat today: UC NRLF & UC SRLF

• Proxies: CRL, LC?

Among 9 ASERL storage collections profiled in 2004:

• 80% of monographic titles held in a single storage facility

[Chart: Titles in ‘shared print’ collections less widely held? Share of titles (0% to 100%) by number of holding libraries (<25, 25-99, 100-499, >499), from less widely held to more widely held, for SRLF (ZAS), NRLF (ZAP), CRL, AZ State (AZS), UC Irvine (CUI) and Rutgers (NJR)]

Page 12

Projected growth of HathiTrust Digital Library

[Chart: growth in volumes and growth in titles, June 2010 - June 2020. Reference lines (*): Harvard University Library in constant 2008 volumes; Library of Congress in constant 2008 volumes]

OCLC Research. June 2010

Page 13

Premise of Cloud Library project (2009-2010)

Emergence of large scale shared print and digital repositories creates an opportunity for strategic externalization of traditional repository function

• Reduce total costs of preserving scholarly record

• Enable reallocation of institutional resources

• Support renovation of library service portfolio

• Create new business relationships among libraries

A bridge strategy to guarantee access and preservation of long tail, low use collections during ongoing p- to e- transition

Page 14

Shared infrastructure: books & bits

Academic off-site storage: +70M vols. in 25 years

HathiTrust: +5M vols. in 15 months

Will this intersection create new operational efficiencies? For which libraries? Under what conditions? How soon and with what impact?

Page 15: The Big Shift: Managing Research Collections in the Cloud

0 20 40 60 80 100 1200%

10%

20%

30%

40%

50%

60%

Rank in 2008 ARL Investment Index

% o

f T

itle

s i

n L

oca

l C

oll

ecti

on

A global change in the library environmentA global change in the library environment

June 2010Median duplication: 31%

June 2009Median duplication: 19%

Academic print book collection already substantially duplicated in mass digitized book corpus (HathiTrust)

OCLC Research. June 2010

Page 16

A mirror of the academic print collection

[Chart: Distribution of Titles in HathiTrust Digital Library by Subject and Copyright Status (June 2010). Titles / editions per subject (0 to 1,000,000), split into Public Domain and In Copyright; N = 3.64M titles. Subjects shown: Language, Linguistics & Literature; Unknown Classification; Philosophy & Religion; Engineering & Technology; Political Science; Sociology; Education; Physical Sciences; Medicine; Agriculture; Mathematics; Performing Arts; Psychology; Chemistry; Medicine By Body System; Health Facilities, Nursing]

A critical mass of retrospective literature in the humanities, social sciences

C. Malpas Cloud-sourcing Research Collections (OCLC, 2010)

Page 17

An opportunity and a challenge

>50% of titles are ‘widely held’

>80% of titles are in copyright

An opportunity to rationalize holdings, but… library print supply chain will be needed for some time

OCLC Research. June 2010

Page 18

Mass-digitized books in print repositories

[Chart: unique titles (0 to 3,500,000) by month, Sep-09 to Jun-10. Series: mass digitized books in Hathi digital repository (~3.5M titles); mass digitized books in shared print repositories (~2.5M)]

~75% of mass digitized corpus is ‘backed up’ in one or more shared print repositories

Page 19

Prediction

Within the next 5-10 years, focus of shared print archiving and service provision will shift to monographic collections

• large scale service hubs will provide low-cost print management on a subscription basis;

• reducing local expenditure on print operations, releasing space for new uses and facilitating a redirection of library resources;

• enabling rationalization of aggregate print collection and renovation of library service portfolio

Mass digitization of retrospective print collections will drive this transition

Page 20

WHAT WILL IT TAKE?

Shared print service provision . . .

Page 21

Shared Print provision: capacity varies

[Chart: % of HathiTrust titles duplicated in print repository]

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of February 2011.

Page 22

Shared print marketplace: who has the edge?

C. Malpas Cloud-sourcing Research Collections (OCLC, 2010)

Page 23

Or, reconfigure resource to maximize value

C. Malpas Cloud-sourcing Research Collections (OCLC, 2010)

Page 24

Management Perspective: How Much is Enough?

Shared Print service must deliver

• Space recovery equal to “one floor” at outset

• Volume reduction equal to X years of print acquisitions

• Cost not to exceed current storage options

• Minimize (visible) disruption in operations

If management of mass-digitized monographs could be externalized to large scale providers today:

• average space recovery of 20,000 ASF per ARL library

• cost avoidance of ~$1M for new storage module

• cost avoidance of $1M per year for on-site management

Page 25

Staff Perspective: What’s Good Enough

Shared Print service provision must equal or exceed

• Turnaround/delivery from local storage (<2 days)

• Local loan period

• Local access/availability guarantee, ability to recall etc

• Discoverability of local resource

Local retention mandated when title held by <10 libraries

No one mentioned . . .

• Home delivery option: direct to patron

• Acceptable loss rate: repository viability

• Penalties for late return: impact on other clients

Page 26

Implications: Shared Print

A small number of repositories may suffice for ‘global’ shared print provision of low-use monographs

Generic service offer is needed to achieve economies of scale, build network; uniform T&C

Fuller disclosure of storage collections is needed to judge capacity of current infrastructure, identify potential hubs

Service hubs will need to shape inventory to market needs; more widely duplicated, moderately used titles

If extant providers aren’t motivated to change service model, a new organization may be needed

Page 27

LOCAL CONTEXT

Shared print in perspective . . .

Page 28

ASERL in system-wide context

~880 academic libraries in ASERL region (2008)

• represents 23% of all academic libraries in the US

• 134 (15%) support institutions offering doctoral programs

38 ASERL libraries provide backbone for academic institutions throughout the region

• Rich collections, robust infrastructure, reliable fulfillment

• ASERL holdings account for ~47% of regional academic collection

• Upholding print preservation mandate an increasing challenge

Page 29

Diversity of institutional mandates

Least reliant on traditional library infrastructure

OCLC Research. Derived from U.S. Department of Education, National Center for Education Statistics, Academic Libraries Survey, 2008.

Page 30

Circulation per FTE student is on the decline

OCLC Research. Derived from NCES Academic Libraries Surveys, 1992-2000.

Declining ROA?

Page 31

Same trend holds within ASERL

[Chart: Median Circulation Transactions per FTE Student in ASERL Member Libraries, 2002 to 2009 (y-axis 0 to 25); annotated: -41%]

OCLC Research. Derived from ASERL Annual Statistics, 2002/2003 – 2009/2010.

Page 32

A long term, system-wide trend

[Chart: US Academic Library Expenditures vs. Total Spending on Post-Secondary Education, 1977 to 2008. Series: Aggregate US Spending on Post-Secondary Education (axis $0 to $400,000,000); US Library Operating Exp. as % of Ed. Spending (axis 0.00% to 3.00%)]

$6.8 billion in 2008

OCLC Research. Derived from data reported in NCES Digest of Education Statistics: 2008.

Page 33

Higher Education funding cuts in 43 States

Page 34

Institutional autonomy varies

OCLC Research. Derived from U.S. Department of Education, National Center for Education Statistics, Academic Libraries Survey, 2008.

Modes of cooperation will vary … as will motivation to share

Page 35

Increasing privatization of Higher Education

OCLC Research. Derived from U.S. Department of Education, National Center for Education Statistics, Academic Libraries Surveys, 2000-2008.

Page 36

Visible differences, hidden similarities

[Chart 1: ASERL Member Holdings in WorldCat. Titles (0 to 4,000,000) per member library]

[Chart 2: ASERL Member Holdings Duplicated in HathiTrust. Title overlap (0% to 45%) per member library]

>56M holdings in aggregate

~34% of collective ASERL collection duplicated

~2M unique (discrete) titles

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

Page 37

Median ASERL duplication in HathiTrust: 33%

[Chart: titles duplicated (0% to 45%) against holdings in WorldCat (0 to 4,000,000)]

Tennessee: 41%

Florida: 27%

[Standard deviation: 3%]

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

OCLC Research. Derived from U.S. Department of Education, National Center for Education Statistics, Academic Libraries Survey, 2008.

Page 38

This edition held in print by more than 2,200 libraries . . .

including all 38 ASERL members

A total of 3 ILL requests since 2007

0 from (or to) ASERL members

Page 39

An example: the University of Miami

[Chart legend: Full View, Search Only]

~1.2 million University of Miami (FQG) library holdings in WorldCat

393,877 (33%) duplicated in HathiTrust Digital Library

30,472 titles

363,405 titles

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.
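The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, the "~1.2 million" total is treated as exactly 1,200,000, and the two title counts are deliberately left unlabeled because the transcript does not say which one is Full View and which is Search Only.

```python
# Hypothetical names; figures copied from the slide text.
total_holdings = 1_200_000   # "~1.2 million" holdings, treated as exact (assumption)
segment_a = 30_472           # first title count on the slide
segment_b = 363_405          # second title count on the slide

# The two segments should add up to the duplicated total quoted on the slide.
duplicated = segment_a + segment_b
share_of_holdings = duplicated / total_holdings

print(duplicated)                   # 393877
print(f"{share_of_holdings:.0%}")   # 33%
```

The two counts sum to the 393,877 duplicated titles, and that total is about a third of the approximate holdings figure, consistent with the 33% on the slide.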

Page 40

Weighing risks and benefits

[Chart: System-wide Print Distribution of University of Miami Titles Duplicated in HathiTrust Digital Library. Titles / editions (0 to 70,000) by number of holding libraries; series: Search Only, Full View]

N = 393,877 titles

77% of mass-digitized titles in Miami’s collection are held by >99 libraries … low risk but print supply chain still needed

96% of mass-digitized titles in Miami’s collection are held by >24 libraries

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

Page 41

Sizing up a potential shared print supplier

~1.2 million Miami (FQG) holdings

FUG could supply 232,827 titles

Represents at least 2.75 miles of library shelving @ Miami

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

Page 42

Risk and opportunity profiles differ

[Two pie charts: titles by number of holding libraries]

Chart 1: <10 libraries 0%; 10 to 24 libraries 1%; 25 to 99 libraries 9%; >99 libraries 90%

Chart 2: <10 libraries 2%; 10 to 24 libraries 13%; 25 to 99 libraries 34%; >99 libraries 51%

N=1.16M titles

N=370K titles

Locally held titles in mass-digitized corpus abundant in system-wide collection

HathiTrust undergirds stewardship mission, redistributes costs of curation

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

Page 43

Stewardship & sustainability: a pragmatic view

Using recent life-cycle adjusted cost model* for library print collections,

$4.25 per volume per year -- on campus

$.86 per volume per year -- in high-density storage

East Carolina University is spending, at minimum, between

[373K titles * $.86 =] $320K to $1.6M [= 373K titles * $4.25] annually

to retain local copies of content preserved in the HathiTrust Digital Library and widely-held in the ASERL community

The library is not financially accountable for these costs but it is responsible for managing them

*Paul Courant and M. “Buzzy” Nielson, “On the Cost of Keeping a Book” in The Idea of Order (CLIR, 2010)
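The cost range quoted on this slide can be reproduced with a short calculation. This is a minimal sketch, not the cost model itself: the names are hypothetical, "373K titles" is treated as exactly 373,000, and, as on the slide, each title is assumed to correspond to one volume. Rates are kept in whole cents to avoid floating-point rounding.

```python
# Hypothetical names; rates and title count copied from the slide.
TITLES = 373_000                 # "373K titles", treated as exact (assumption)
ON_CAMPUS_CENTS = 425            # $4.25 per volume per year, on campus
HIGH_DENSITY_CENTS = 86          # $.86 per volume per year, high-density storage


def annual_cost_dollars(volumes: int, cents_per_volume: int) -> float:
    """Annual retention cost in dollars, assuming one volume per title."""
    return volumes * cents_per_volume / 100


low = annual_cost_dollars(TITLES, HIGH_DENSITY_CENTS)
high = annual_cost_dollars(TITLES, ON_CAMPUS_CENTS)

print(f"${low:,.0f} to ${high:,.0f} per year")  # $320,780 to $1,585,250 per year
```

Rounded, this gives the slide's "$320K to $1.6M". Multi-volume titles would push the real figure higher, which fits the slide's "at minimum".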

Page 44

Where to turn?

• Existing cooperative network: UNC system

• UNC, NCSU & Duke are HathiTrust partners, participate in TRLN shared copy program – potential shared print suppliers?

~1.2 million ECU (ERE) holdings

Represents at least 4 miles of library shelving @ East Carolina

373,370 (32%) in HathiTrust Digital Library

Page 45

ASERL libraries: a common trajectory, different timelines

[Chart: % of titles duplicated in HathiTrust Digital Library (0% to 45%) by month, Sep-09 to Sep-13. Series, each with a linear trend line: Private non-ARL, Public non-ARL, Public ARL, Private ARL. Dates marked: Sep 2012, Dec 2012, Jun 2013, Sep 2013]

The next few years are critical

How can regional infrastructure be leveraged to support this change?

OCLC Research. Analysis based on HathiTrust and WorldCat snapshot data. Data current as of April 2011.

Page 46

A closing thought

If we don’t demonstrate a little backbone developing shared print solutions, the future of legacy print could look like this

Guillotined books en route to recycling station.

Page 47

Thanks for your attention.

Comments, Questions?

Constance [email protected]

@ConstanceM

Page 48

For discussion

• What criteria matter most in assessing potential shared print partners?

  • Geographic proximity, institutional governance, scope of collection, delivery guarantee, etc?

• Is the economic integration of Southeastern mega-region(s) a factor to consider in shared print business planning?

  • Are partnerships in zones of strong economic integration likely to be more sustainable?

• How is the increasing privatization of higher education likely to affect regional shared print planning?

  • Do private and charter universities have greater flexibility in externalizing print operations?