Much Ado about Everything: Data, Publications, and the Role of Repositories Rebecca Kennison Center for Digital Research and Scholarship Columbia University


Post on 26-Dec-2015



Much Ado about Everything: Data, Publications, and the Role of Repositories

Rebecca Kennison

Center for Digital Research and Scholarship

Columbia University


What is a research repository?

An online repository holding “a complete version of the work and all supplemental materials, including a copy of the permission[s] …, in an appropriate standard electronic format … using suitable technical standards …, that is supported and maintained by an academic institution, scholarly society, government agency, or other well-established organization that seeks to enable open access, unrestricted distribution, interoperability, and long-term archiving.”

— Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (2003)


What is an institutional repository?

More specific: Output from single institution

More general: Inclusion of entire output of the enterprise (including administrative material)


Focus of repository strategy to date

Research paper (whether preprint or final published version), with “supplementary (or supporting) materials.”


Example of content distribution

OAIster search: psycholog* in title

Total: 19,047

Text: 13,733

Images: 93

Audio: 5

Video: 48

Dataset: 1

Unidentified: 5,167
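The imbalance in these counts is easier to see as percentages. The following is a minimal sketch that uses only the numbers on the slide; the variable and function names are illustrative and not part of the original talk.

```python
# Counts as shown on the slide (OAIster search: psycholog* in title).
counts = {
    "Text": 13733,
    "Images": 93,
    "Audio": 5,
    "Video": 48,
    "Dataset": 1,
    "Unidentified": 5167,
}

# The per-type counts add up to the slide's total of 19,047.
total = sum(counts.values())


def share(kind: str) -> float:
    """Return the percentage of all records that have the given type."""
    return 100.0 * counts[kind] / total


# Print one row per content type: name, count, percentage of total.
for kind, count in counts.items():
    print(f"{kind:<13}{count:>7,}  {share(kind):6.2f}%")
```

Text accounts for roughly 72 percent of the records, while the single dataset is well under a hundredth of a percent.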


A different view

Publication is snapshot in time of ongoing research

Cost of publication is small part of total cost of research (e.g., data collection and data analysis) — perhaps as little as 1%

Much of intellectual and financial investment of institution is not in publications, but in other research outputs


Examples of research output

Archival materials (e.g., e-mail correspondence)

Computer executable code (e.g., simulations)

Databases

Datasets

Electronic portfolios

Electronic theses and dissertations

Multimedia objects (e.g., PowerPoint presentations, audio, video, graphics, animations, CAD)

Online media (e.g., blogs, wikis, Web sites)

Photographs

Podcasts, pubcasts, postercasts

Scientific visualizations of datasets

Software and tutorials

Teaching materials and learning objects

Text files (e.g., spreadsheets, document files, LaTeX, RTFs, PDFs)


What is value provided by research repository?

Collocation

Interoperability

Consistent content models

Harvestable metadata for inclusion in subject- or region-oriented repositories

Archiving and ongoing access (even when soft money dries up)

Preservation and permanence
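"Harvestable metadata" can be made concrete with a small sketch. It assumes the repository exposes its records through OAI-PMH with Dublin Core (`oai_dc`) metadata, which the slides do not state. The endpoint URL, the set name, and the sample record are hypothetical, and no network request is made.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, for illustration only.
BASE_URL = "https://repository.example.edu/oai"

# Namespace of the Dublin Core elements used inside oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url: str, set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for Dublin Core metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        # A set lets a subject- or region-oriented harvester ask for a subset.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Invented sample of the metadata a harvester might receive for one record.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example survey responses</dc:title>
  <dc:type>Dataset</dc:type>
  <dc:subject>Psychology</dc:subject>
</oai_dc:dc>"""


def dc_fields(xml_text: str) -> Dict[str, List[str]]:
    """Collect the Dublin Core fields of one record, keyed by element name."""
    fields: Dict[str, List[str]] = {}
    for element in ET.fromstring(xml_text):
        if element.tag.startswith(DC_NS):
            name = element.tag[len(DC_NS):]
            fields.setdefault(name, []).append(element.text or "")
    return fields


print(list_records_url(BASE_URL, set_spec="psychology"))
print(dc_fields(SAMPLE_RECORD))
```

Because every repository would answer the same kind of request with the same kind of record, a subject-oriented service could in principle aggregate records from many institutions without custom code for each one.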


Why do researchers not participate?

What’s in it for me??


Faculty perception of research repositories

More value for user or institution than for depositor

Lack of control over content

Limitations on content types

Access to that content

Reuse of the content

Allen, J. (2005) Interdisciplinary differences in attitudes towards deposit in institutional repositories. Master's thesis, Department of Information and Communications, Manchester Metropolitan University (UK).

Foster, N. F. & Gibbons, S. (2005) Understanding faculty to improve content recruitment for institutional repositories. D-Lib Magazine 11(1). Retrieved from http://www.dlib.org/dlib/january05/foster/01foster.html


Why this perception?

Focus of institutional policies and scholarly communication discussions (e.g., Green OA) has been on deposit of traditional publications, rather than materials researchers are most concerned with sharing and preserving


Focus of repository

Reflect needs of research community (collaboration, data security and confidentiality, access, priority claims, visibility and impact, quality certification, archiving and preservation)

Advance scholarship through accumulation of content of importance to that community

Not be seen as merely solving problems of libraries or being trendy

Be part of cooperative partnerships in open and interoperable manner


What’s in it for institutions?

Collection and preservation of the complete output that research spending produces, data as well as articles

Better understanding and assessment of that total research output

Increased global impact and “brand recognition” for the university

Accelerated knowledge and research efficiencies


Benefits of research repository

Choice of what to deposit, and the terms of access and reuse, determined by researcher

Research data made available alongside published outputs based on that data

Publication (as in making public) may include negative results, incremental findings

Value of research can be based on quality of databases, datasets, and other outputs, not on publications alone

Data required by funders and journals to be made available or shared can be deposited in repository

Interoperable research repositories can provide for unexpected use and novel reuse

Impact can be tracked through robust metrics


Challenges for research repository

What counts as research output varies from discipline to discipline

Research data are much more difficult than publications to ingest, to make accessible, to regularize, and to preserve for the long term, and thus require much more infrastructure

Interoperability and dynamic cross-linking of data with publications or related data are not yet well-developed technologies (e.g., resource maps)

Cooperation is needed among government agencies, publishers, societies, universities, departments, and researchers


Biggest challenge: Show me the $$$!

Staffing for customization of software, education and training, curation, and data migration

Storage: petabytes, if not exabytes

Need for long-term institutional commitment and sustainable business models


Some predictions

Research communities become even more diverse, more interdisciplinary, more geographically dispersed

What counts for tenure and promotion will change

Blurring of lines between traditional and new forms of communication continues

Roles in and workflows for scholarly communication are transformed

Search engines become increasingly better at indexing content of all types

Semantic Web is leveraged in exciting new ways to integrate data and literature (e.g., BioLit)


The problem — and the solution

Thank you!