10th International CALIBER 2015 Managing Research Data in Academic Institutions

Managing Research Data in Academic Institutions: Role of Libraries

Mallikarjun Dora, H Anil Kumar

Abstract

One of the emerging global trends in academic libraries is to facilitate the management of research data for the benefit of researchers and institutions. The purpose of this paper is to explore the role of a library in offering such research data management services. The paper discusses the importance of research data, its preservation, organization and dissemination, and its critical role in the scholarly research life cycle. The authors attempt to provide a vivid description of Research Data Management (RDM) as a service and, in the process, review the existing literature on the topic, in addition to indicating the tools and technologies that could be adopted in successful RDM service implementation. The paper also attempts to share the experience of creating the Vikram Sarabhai Library’s research data repository, which was developed by adopting the open source software CKAN.

Keywords: Research Data Management, Research Data Service, CKAN, Research Data, Data Repository

1. Introduction

Scholarly research is an important indicator of national development and reflects the potential of a nation to harness its human resources to solve the problems of mankind. Global problems of health, education, poverty, etc. can be better understood and addressed effectively through research. Scholarly research broadens the horizon of policy thinking in addressing many critical issues that governments face.

In most cases, scholarly research has at its core the need for data. EPSRC (Engineering and Physical Sciences Research Council) defines research data as “recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created”. Research data can be raw data produced directly from the lab or a survey, or it can be processed data which has been cleaned, refined, arranged and combined in a manner that makes it useful in research. Research data also include data which are already published in a journal or in other scientific communication. Research data includes analogue sources as well as discrete digital objects (text, files, image, audio, video), complex digital objects (discrete digital objects made by combining a number of other digital objects) and databases (Whyte and Tedds, 2011).

10th International CALIBER-2015, HP University and IIAS, Shimla, Himachal Pradesh, India, March 12-14, 2015. © INFLIBNET Centre, Gandhinagar, Gujarat, India

Researchers all over the world are generating large volumes of data sets for their research. Advancements in information technology, the availability of a large number of electronic data sources and powerful data analysis software have together enabled researchers to generate and work with large data sets. The challenge is in preserving and disseminating these large data sets, which would be very valuable to the academic community, for future needs.

Individual researchers are not only generating data but also attempting to preserve it for reuse and sharing. Researchers either save and manage the data themselves or place it directly in open repositories like figshare (www.figshare.com) or dryad (www.datadryad.org).

Journal publishers have started maintaining their own data repositories so that authors can host data for preservation, verification and management. Research institutions are also initiating research data management services to preserve and manage data for future reuse and transparency in research. Management of research data has recently emerged as a strategic priority for the university (Pryor, 2012). A number of universities, scientific institutions and governments are thinking along similar lines and have started building their research data infrastructure and management (Pinfield, Cox and Smith, 2014).

The growing number of entries in the registries of data repositories validates the need for such repositories among the academic and research fraternity. There are about 1000 research data repositories registered in each of re3data (http://www.re3data.org) and databib (http://databib.org/index.php), which are among the major research data registries. At the same time, the popularity of services like Figshare and Dryad, where individual researchers can keep their research data, also indicates the potential value of and need for research data management.

It is in this context that it may be worthwhile to review the role of libraries in providing research support to their users. Traditionally, libraries have played the role of a partner in research, right from the time a researcher initiates the process of identifying a research problem to the final publication of a paper in a journal or presentation at a conference. The library has a key role in organising, managing, preserving, discovering and disseminating research to a wide and relevant audience. Extending their traditional role in providing research support, libraries today have the opportunity to facilitate the management of research data as well.

This paper explores the role of libraries and library professionals in Research Data Management (RDM) and also presents a case study of the CKAN software, which can be used for creating and preserving research data, and its implementation at the Indian Institute of Management Ahmedabad.

2. Research Data Management

“Research data management concerns the organisation of data, from its entry into the research cycle through to the dissemination and archiving of valuable results. It aims to ensure reliable verification of results, and permits new and innovative research built on existing information” (Whyte and Tedds, 2011). A service to manage research data, in other words a Research Data Management Service (RDMS), would have to consider the entire life cycle of research data, starting from the creation of data to its reuse. The UK Data Service provides a clear depiction of the research data life cycle, as shown in Figure 1.


Figure 1: Research Data Life Cycle

Image Source: http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

Cox and Pinfield (2013) describe a research data management service as consisting of different activities and processes that include the creation, storage, security, preservation, retrieval, sharing and re-use of data, taking into account technical capabilities, ethical considerations, legal issues and governance frameworks.

There are many benefits an institution can derive from an effective research data management service. Some of them include:

- Long-term preservation of data allows the data to be validated in the future, which enhances the credibility and transparency of the research data used.

- Research data can be reused by the same researchers, or by others who may like to extend the use of such data for purposes unforeseen by the initial researcher.

- Well-managed research data can always be updated to enhance or extend the understanding of existing research on this data.

- It is economical to reuse data, which saves time and resources for an institution and hence provides opportunities to invest elsewhere.

- Opening such research data sets to the public enhances the visibility of the host institution and its researchers.

- A research data management service enhances the discoverability of such data, thereby facilitating quality research.

The design and development of a research data management service would have to consider many factors, including:

- Identifying the various stakeholders who would contribute to, manage and use the service.

- Understanding the needs of these stakeholders in order to design a sustainable and user-friendly research data management service.

- Reviewing and adopting the standards recommended in the various guidelines published for this purpose.

- Reviewing the need either to develop customised software to handle the requirements or to adopt existing commercial or open source systems.

- Studying and reviewing the IT infrastructure required for successful implementation of a research data management service.

- Developing institution-relevant guidelines and policies for the hosting, sharing and reuse of research data by its researchers.

Jones, Pryor and Whyte (2013) have developed a useful working-level guide for higher education institutions planning to create Research Data Management (RDM) services. The guide provides a working description of why RDM services are required, the roles and responsibilities of the stakeholders, the process of developing the service, and a detailed description of the various components of the RDM service. The guide is also very useful for institutions planning to implement RDM services as it provides links to a large body of training materials.

It may be worthwhile to look at the experiences of various institutions around the world that have been offering RDM services. Akers, Sferdean, Nicholls and Green (2014) review eight US universities classified by the Carnegie Foundation as research universities engaged in ‘very high’ research activity. The universities are Cornell, Emory, Johns Hopkins, Pennsylvania State, Purdue, Illinois at Urbana-Champaign, Michigan and Virginia. The authors found that, despite differences in their approach to RDM implementation, most of the institutions face common challenges in developing RDM support programs. The major issues highlighted in the study include the difficulty of reaching out to researchers to improve research data management practices, and of seeking funding for new staff positions and infrastructure. It is also interesting that, at all of these universities, the respective libraries played a prominent role in the design and development of RDM services.

3. Libraries and Research Data Management

To further explore the role of libraries in the RDM services offered at research institutions, it is useful to take note of a few papers on this topic. Gold (2007) described the potential role of libraries in managing data, with a focus on social science data, geo-referenced data and bioinformatics data. Henty (2008) surveyed Australian universities to identify existing data management practices and trends, and also explored the possible roles of libraries and librarians in this context. Lewis (2010) examined in detail the roles and skills of university librarians in the UK in the context of RDM and suggested upskilling the existing library workforce through education and training on research data management.

One of the early surveys to study the preparation and attitude of librarians towards research data services was undertaken by Tenopir, Sandusky, Allard and Birch (2012). The survey was conducted among 223 librarians of the Association of Research Libraries (ARL), and the findings indicated that although a very low percentage of libraries were involved in RDMS offerings, the librarians believed that this was an important service for academic research libraries to render. Similar findings were reported from the survey conducted by Corrall, Kennan and Afzal (2013) among 140 libraries in Australia, New Zealand, Ireland and the United Kingdom. They also found that the RDM service represents a relatively new development in library service offerings, though there was interest among the libraries in offering RDS, with a high proportion of libraries in the process of planning to offer RDM service support.

Meanwhile, Tenopir, Sandusky, Allard and Birch (2014) extended their previous study to survey and understand the perceptions of 223 library directors in US and Canadian libraries towards RDM services. They found that RDMS was not frequently employed in libraries, but that there were academic and research libraries already offering RDM services, with more planning to initiate RDMS in the next couple of years. There was a small but growing number of libraries that were becoming more involved in RDM by helping with data management plans and with preparing and preserving research data.


To examine the contribution of academic libraries to research data management, Pinfield, Cox and Smith (2014) conducted semi-structured interviews with 26 library staff from different UK institutions. The study found that though libraries were playing an important role in RDM, there was a lack of consistent support from various stakeholders at the institution. The study also identified various factors and issues that were important for successful RDM service implementation. Based on its findings, the study proposed a new model for an RDM programme that could help overcome barriers to the successful implementation of such RDM services.

Another report worth mentioning in this context is the final report of the LIBER working group on E-Science / Research Data Management (Christensen-Dalsgaard, 2012), which concluded with ten recommendations to libraries for providing RDM support.

Existing literature and studies on Research Data Management services indicate quite clearly that there are new opportunities for libraries and library professionals in this area. Library professionals have already established themselves as experts in metadata, data curation and preservation techniques, and hence can now extend their role to research data management as well. Library professionals can not only create infrastructure for research data management but also help in designing institutional policies and frameworks, and build a bridge between administrative staff and researchers in developing research data management services.

4. Research Data Registry, Repository and Software

Prior to embarking on the development of an RDM service, it is important to study the various existing initiatives in the form of registries and repositories. A registry typically lists various research data repositories, while the repositories themselves host the research data. Table 1 shows the two main registries that list research data repositories on various topics.

Table 1: Research Data Registries

| Service | About | Founded by | Number | Website |
|---|---|---|---|---|
| Databib | Tool to locate online repositories of research data; originally sponsored by a Sparks Innovation National Leadership Grant | Institute of Museum and Library Services; hosted by Purdue University | 993 | http://databib.org/index.php |
| Re3data | Registry of research data repositories that covers research data from different academic disciplines | German Research Foundation; partners: Berlin School of Library and Information Science, GFZ German Research Centre for Geosciences, KIT Library, Purdue University | 1093 | http://www.re3data.org |

The popular repositories of research data include Dryad, Figshare and Harvard Dataverse. Table 2 depicts the main features of these three research data repositories.


Table 2: Research Data Repositories

| Service | About | Founded by | Number | Deposit Policy | Website |
|---|---|---|---|---|---|
| Dryad | Curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable and citable. Organizations can take up membership to submit their data to the Dryad repository. | Nonprofit membership organization hosted at North Carolina State University | Data packages: 7407; KNB: 24249; TreeBASE: 2636 | Users can submit their data to this repository; there is also an option for organizations to take up membership of Dryad | http://datadryad.org |
| Figshare | Repository where users can make all of their research outputs available in a citable, sharable and discoverable manner | Independent body supported by Digital Science | 1120830 files | Users can submit their data; open access deposits are free, while private storage is charged | http://figshare.com |
| Harvard Dataverse Network | Free and open to all researchers worldwide to share, cite, reuse and archive research data | Institute of Quantitative Social Science at Harvard University | 755386 files; 886 data repositories | Researchers can upload data into the Dataverse network free of charge up to 1 TB | https://thedata.harvard.edu/dvn |

KNB = Knowledge Network for Biocomplexity; TreeBASE = a repository of phylogenetic information

One of the most important factors in the successful implementation of an RDM service is the selection of software to manage the research data effectively. The open source options available include Databank, CKAN and Dataverse. Table 3 lists these open source options along with the relevant website links.

Table 3: Open Source Software Solutions for RDM

| Software | Created by | Software and Platform | Website |
|---|---|---|---|
| Databank | Oxford Bodleian Libraries | Open source software; Linux platform | http://www.dataflow.ox.ac.uk/index.php/databank |
| CKAN | Open Knowledge Foundation | Open source software; Linux platform | http://ckan.org |
| Dataverse | Institute of Quantitative Social Science at Harvard University | Open source software; Linux platform | http://dataverse.org |

DataBank: DataBank is a scalable data repository designed for institutional deployment. It is an open source project that promotes free-to-use, cloud-hosted systems for the management, preservation and publication of research data sets. DataBank was created by Oxford Bodleian Libraries.

CKAN: The Comprehensive Knowledge Archive Network (CKAN) is an open source data management platform adopted by numerous governments, organizations and communities around the world. CKAN is supported by the Open Knowledge Foundation (OKFN) and is one of the most popular research data management software packages available. There are 116 CKAN instances around the world, covering all types of organizations: Government, Local Government, Academic, Community and Other. Most government open data sites run on CKAN software. Very few academic institutions have adopted CKAN for their research data management needs.

5. Research Data Management Repository at IIMA

The Indian Institute of Management Ahmedabad is one of the leading business schools in India and the world. Its library, the Vikram Sarabhai Library (VSL), serves about 100 core faculty members, 100 research students, 1000 students and a good number of research associates working at IIM Ahmedabad. VSL also caters to researchers from around the country, mainly in the subjects of business and management.

Business and management researchers need different kinds of data sets for their research, which they source from subscribed resources, open data or primary data collected by themselves through surveys and other instruments. The library and its staff play a key facilitating role in research data collection in the context of subscribed and open sources. In addition to identifying the sources, library staff are actively involved in downloading, collecting, organising and disseminating the datasets to library users as per individual needs. These datasets can include raw data, processed data or analysed data. Over time, with the increased focus on research at the Institute, large volumes of datasets are expected to be generated. The challenge will be in appropriately preserving these datasets for future access and reuse.

The library and the academic research fraternity at the Indian Institute of Management Ahmedabad therefore require a service that helps preserve data in a standardised format for long-term access and reuse. An effective research data management service may be the solution to this challenge, and this paper proposes an RDM service for the Institute.

5.1 CKAN - Software solution adopted for the RDM service at IIMA

The existing open source software options were reviewed; the options considered included the Harvard Dataverse Network, Databank and CKAN. One of the major factors in favour of CKAN was that many data repositories had adopted it and it seemed to be quite popular with various national governments for hosting their data repositories. In addition to excellent features, CKAN also has a very active user community, and this was one of the main reasons why it was selected to develop the IIMA RDM service.

CKAN has all the important functions, such as data storage, licensing, metadata, persistent URLs and authentication, which are very much required in research data management. The major features are:

- Integration with the institutional research environment (e.g. hooks into CRIS system, Institutional Repository, DMP Online, networked storage)

- Capturing the research process/context/activity notation, not just data

- Controlled access for research partners

- Good, comprehensive search tools

- Version control for data and metadata

- Customisable, extensible metadata

- Adherence to data standards, e.g. RDF

- Multi-level access policies

- Secure, backed-up, scalable file storage for anywhere access to files and file sharing (e.g. Dropbox)

- Command-line tools and a good web UI for deposit/update of data

- Permanent URIs for citation, e.g. DOIs

- Import/export of common data formats

- Linking datasets (by project, type, research output, person, etc.)

- Rights/license management

- Commercial support; widely used, popular platform

- Documentation for installation and customization is not comprehensive and could be a barrier for non-IT users
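As a rough illustration of the search tools listed above, CKAN exposes search through its web API via the `package_search` action. The sketch below only builds the request URL and does not contact any server; the site address, the helper name `build_search_url` and the example tag are assumptions made for illustration, not details of the IIMA installation.

```python
from urllib.parse import urlencode

# Hypothetical site address; replace with the address of a real CKAN site.
SITE_URL = "https://data.example.org"


def build_search_url(site_url, query, tags=(), rows=10):
    """Return a URL for CKAN's package_search action.

    q    -- free-text query
    fq   -- filter query, used here to restrict results to given tags
    rows -- maximum number of datasets to return
    """
    params = {"q": query, "rows": rows}
    if tags:
        # Combine the tag filters so that every listed tag must match.
        params["fq"] = " AND ".join(f'tags:"{tag}"' for tag in tags)
    base = site_url.rstrip("/")
    return f"{base}/api/3/action/package_search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url(SITE_URL, "stock prices", tags=["finance"], rows=5))
```

Fetching the resulting URL from a running CKAN site should return a JSON document describing the matching datasets; the exact response depends on the CKAN version and site configuration.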

5.2 Installation and Customization

CKAN can be installed in three ways: (1) installing from an operating system package, (2) installing from source, or (3) installing using a Docker image. The recommended operating system for CKAN is Ubuntu 12.04 64-bit. CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. The CKAN website provides a detailed manual for each procedure, which can be followed to install CKAN for the respective data repository.

5.3 Creating an IIMA Dataset Repository

A dataset is a unit or group of data; it can be demographic data of a country, financial data, economic data, etc. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML and RDF. The data can be stored internally, or an external link to the data host can be provided. A dataset may contain multiple types of data, and each can be added separately.
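The two-part structure described above can be sketched as a plain Python dictionary shaped like the JSON that CKAN's `package_create` API action accepts. This is a minimal sketch, not the IIMA configuration: the mapping of the listed fields to the keys `name`, `notes`, `owner_org`, `private` and `url` reflects our reading of the CKAN API, all example values are hypothetical, and State is left out on the assumption that the system manages it.

```python
import json


def make_dataset(name, title, description, tags, license_id, owner_org,
                 private=True, source=None, author=None, maintainer=None):
    """Build the metadata part of a dataset as a plain dictionary."""
    return {
        "name": name,                 # URL slug of the dataset
        "title": title,
        "notes": description,         # free-text description
        "tags": [{"name": tag} for tag in tags],
        "license_id": license_id,
        "owner_org": owner_org,       # owning organization
        "private": private,           # visibility
        "url": source,                # source of the data
        "author": author,
        "maintainer": maintainer,
        "resources": [],              # the data part, filled in below
    }


def add_resource(dataset, name, fmt, url):
    """Attach one data file or external link to the dataset."""
    dataset["resources"].append({"name": name, "format": fmt, "url": url})
    return dataset


# All values below are hypothetical and only illustrate the structure.
dataset = make_dataset(
    name="india-gdp-quarterly",
    title="India GDP, quarterly",
    description="Quarterly GDP figures compiled by library staff.",
    tags=["economy", "india"],
    license_id="cc-by",
    owner_org="example-library",
    author="Example Researcher",
    maintainer="Example Librarian",
)
# A dataset may hold several types of data, each added separately.
add_resource(dataset, "GDP table", "CSV", "https://data.example.org/gdp.csv")
add_resource(dataset, "Codebook", "PDF", "https://data.example.org/codebook.pdf")

payload = json.dumps(dataset, indent=2)
print(payload)
```

Keeping the metadata and the resources in one serialisable structure makes it straightforward to validate a record before it is submitted through the web interface or the API.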

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users of the system. Before adding datasets, the administrator must create ‘organizations’ and ‘groups’. Each organization has its own administrator, who can decide the rights and responsibilities of registered users in that organization. Each dataset is owned by an ‘organization’, and each ‘organization’ can have its own workflow and authorization procedure. Datasets can be made private or public, which lets the organization administrator decide whether access to a dataset is limited to specific organizations. The administrator of the organization can also assign the role of editor or viewer to each registered user of that organization.

The content initially planned for the IIMA RDM service includes:

- Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw data, processed data or analysed data.

- Datasets generated by the library staff from open access or subscribed sources and compiled in a format that will be useful to the researchers.

The major objectives of the IIMA RDM service are articulated as:

- Long-term preservation of datasets generated by its researchers.

- Sharing data sets for collaborative research.

- Developing an archive of datasets already compiled in response to earlier reference queries, so as to facilitate reuse of the datasets not only by the researchers but also by the reference staff in addressing future reference requests.

- Standardizing various datasets and integrating them into a single platform to facilitate search and discovery in addition to preservation.

- Assisting the researchers in avoiding duplication of effort in compiling datasets.

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 data sets in 2 organisations.

Figure 2: Home Page of Data Repository


Figure 3: Data Sets

Figure 4: Details of data set

Figure 5: Data Set Preview


6. Conclusion

The Research Data Management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making and planning may differ from institution to institution in the context of RDM services. However, one common critical factor in the success of an RDM service is the active participation of the major stakeholders, such as the director, administrators, library and IT staff, and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate their desire to share their data openly or to restrict its use to private communities.

The sharing of data is increasingly becoming a globally accepted objective, with governments mandating open access to the data and research they fund. The opening up of World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA, research is being emphasised in faculty recruitment and evaluation, leading to a situation in which large volumes of research data will be generated. In addition to this type of data, an increase in research-data-based reference services by the library has made the implementation of an RDM service in the library necessary. Instead of waiting to develop institutional policies for sharing or securing data, the library has embarked on a path to create a research dataset repository first and then look at policy issues relating to access. The library is confident that this RDM service will be useful both to its users and to the reference staff in the library.

Selecting the software, CKAN, took considerable time; it was finally identified as appropriate because it was the best available in the open source domain. As the importance and implementation of such services increase among libraries in future, a number of technologies and tools are likely to become available for adoption. It is envisaged that the research community of IIMA will find the existing RDM service useful and that it will lead to more research at the Institute.

References

1. Akers, K. G., Sferdean, F. C., Nicholls, N. H., & Green, J. A. (2014). Building Support for Research Data Management: Biographies of Eight Research Universities. International Journal of Digital Curation, 9(2), 171–191. doi:10.2218/ijdc.v9i2.327

2. Christensen-Dalsgaard, B. (2012). Ten recommendations for libraries to get started with research data management: Final report of the LIBER working group on E-Science / Research Data Management. Retrieved January 30, 2014, from http://www.libereurope.eu/sites/default/files/The%20research%20data%20group%202012%20v7%20final.pdf

3. Corrall, S., Kennan, M. A., & Afzal, W. (2013). Bibliometrics and Research Data Management Services: Emerging Trends in Library Support for Research. Library Trends, 61(3), 636–674. doi:10.1353/lib.2013.0005

4. Cox, A. M., & Pinfield, S. (2013). Research data management and libraries: Current activities and future priorities. Journal of Librarianship and Information Science, 0961000613492542. doi:10.1177/0961000613492542

5. EPSRC. Policy Framework on Research Data: Scope and benefits. Engineering and Physical Sciences Research Council, 2011. Available from http://www.epsrc.ac.uk/about/standards/researchdata/scope

6. Gold, A. (2007). Cyberinfrastructure, data, and libraries, part 2: Libraries and the data challenge: Roles and actions for libraries. D-Lib Magazine, 13(9/10). http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

7. Henty, M. (2008). Dreaming of data: The library’s role in supporting e-research and data management. Australian Library and Information Association Biennial Conference, Alice Springs. Available at http://apsr.anu.edu.au/presentations/henty_alia_08.pdf

8. Jones, S., Pryor, G., & Whyte, A. (2013). ‘How to Develop Research Data Management Services - a guide for HEIs’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available at http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

9. Lewis, M. J. (2010). Libraries and the management of research data. In McKnight, S. (ed.), Envisioning future academic library services. London: Facet. Available at http://eprints.whiterose.ac.uk/11171/1/LEWIS_Chapter_v10.pdf

10. Pinfield, S., Cox, A. M., & Smith, J. (2014). Research Data Management and Libraries: Relationships, Activities, Drivers and Influences. PLoS ONE, 9(12), e114734. doi:10.1371/journal.pone.0114734

11. Pryor, G. (ed.) (2012). Managing Research Data. London: Facet.

12. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2014). Research data management services in academic research libraries and perceptions of librarians. Library & Information Science Research, 36(2), 84–90. doi:10.1016/j.lisr.2013.11.003

13. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2012). Academic librarians and research data services: preparation and attitudes. IFLA Journal, 39(1), 70–78. doi:10.1177/0340035212473089

14. UK Data Service. Research data life cycle. Available at http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

15. University of Lincoln. Orbital Project: Choosing CKAN for research data management. Available at http://orbital.blogs.lincoln.ac.uk/2012/09/06/choosing-ckan-for-research-data-management

16. Whyte, A., & Tedds, J. (2011). ‘Making the Case for Research Data Management’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm

Websites visited

1. http://ckan.org

2. http://databib.org/index.php

3. http://datadryad.org

4. http://figshare.com

5. http://www.dataflow.ox.ac.uk/index.php/databank

6. http://www.re3data.org

7. https://thedata.harvard.edu/dvn

About Authors

Mr. Mallikarjun Dora, Professional Assistant, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: mallikarjun@iimahd.ernet.in

Dr. H. Anil Kumar, Librarian, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: anilkumar@iimahd.ernet.in


Advances in information technology, the availability of a large number of electronic data sources, and powerful data analysis software have together enabled researchers to generate and work with large data sets. The challenge lies in preserving and disseminating these data sets, which would be very valuable to the academic community, for future needs.

Individual researchers are not only generating data but also attempting to preserve it for reuse and sharing. They either keep and manage the data themselves or deposit it directly in open repositories such as Figshare (www.figshare.com) or Dryad (www.datadryad.org).

Journal publishers have started maintaining their own data repositories so that their authors can host data for preservation, verification and management. Research institutions are also initiating research data management services to preserve and manage data for future reuse and for transparency in research. Management of research data has recently emerged as a strategic priority for universities (Pryor, 2012). A number of universities, scientific institutions and governments are thinking along similar lines and have begun building their research data infrastructure and management services (Pinfield, Cox and Smith, 2014).

The growing number of entries in registries of data repositories confirms the need for such repositories among the academic and research fraternity. About 1,000 research data repositories are registered in each of re3data (http://www.re3data.org) and Databib (http://databib.org/index.php), which are among the major research data registries. At the same time, the popularity of services such as Figshare and Dryad, where individual researchers can keep their research data, also indicates the potential value of, and need for, research data management.

It is in this context that it may be worthwhile to review the role of libraries in providing research support to their users. Traditionally, libraries have played the role of a partner in research, from the time a researcher begins identifying a research problem to the final publication of a paper in a journal or its presentation at a conference. The library has a key role in organising, managing, preserving, discovering and disseminating research to a wide and relevant audience. Extending their traditional role in research support, libraries today have the opportunity to facilitate the management of research data as well.

This paper explores the role of libraries and library professionals in Research Data Management (RDM) and presents a case study of the CKAN software, which can be used for creating and preserving research data, and its implementation at the Indian Institute of Management Ahmedabad.

2. Research Data Management

“Research data management concerns the organisation of data, from its entry into the research cycle through to the dissemination and archiving of valuable results. It aims to ensure reliable verification of results, and permits new and innovative research built on existing information” (Whyte and Tedds, 2011). A service to manage research data, in other words a Research Data Management Service (RDMS), would have to consider the entire life cycle of research data, from the creation of data to its reuse. The UK Data Service provides a clear depiction of the research data life cycle, as shown in Figure 1.


Figure 1: Research Data Life Cycle

Image Source: http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

Cox and Pinfield (2013) describe a research data management service as consisting of different activities and processes that include the creation, storage, security, preservation, retrieval, sharing and re-use of data, taking into account technical capabilities, ethical considerations, legal issues and governance frameworks.

An institution can derive many benefits from an effective research data management service. Some of them include:

Long-term preservation of data allows the data to be validated in the future, which enhances the credibility and transparency of the research data used.

Research data can be reused by the same researchers, or by others who may wish to extend its use to purposes not foreseen by the initial researcher.

Well-managed research data can always be updated to enhance or extend the understanding of existing research based on it.

Reusing data is economical: it saves an institution time and resources, which can then be invested elsewhere.

Opening such research data sets to the public increases the visibility of the host institution and its researchers.

A research data management service enhances the discoverability of such data, thereby facilitating quality research.

The design and development of a research data management service would have to consider many factors, including:

Identifying the various stakeholders who would contribute to, manage and use the service.

Understanding the needs of these stakeholders in order to design a sustainable and user-friendly research data management service.

Reviewing and adopting standards recommended in the various guidelines published for this purpose.

Reviewing the need either to develop customised software to handle the requirements or to adopt existing commercial or open source systems.

Studying and reviewing the IT infrastructure required for successful implementation of a research data management service.

Developing institution-relevant guidelines and policies for hosting, sharing and reuse of research data by its researchers.

Jones, Pryor and Whyte (2013) have developed a useful working-level guide for higher education institutions planning to create Research Data Management (RDM) services. The guide provides a working description of why RDM services are required, the roles and responsibilities of the stakeholders, the process of developing the service, and a detailed description of the various components of an RDM service. It is also very useful for institutions planning to implement RDM services because it provides links to a large body of training materials.

It may be worthwhile to look at the experiences of various institutions around the world that have been offering RDM services. Akers, Sferdean, Nicholls and Green (2014) review eight US universities classified by the Carnegie Foundation as research universities engaged in ‘very high’ research activity. The universities are Cornell, Emory, Johns Hopkins, Pennsylvania State, Purdue, Illinois at Urbana-Champaign, Michigan and Virginia. The authors found that, despite differences in how they approached RDM implementation, most of the institutions faced common challenges in developing RDM support programs. Other major issues highlighted in the study include the difficulty of reaching out to researchers to improve research data management practices and of securing funding for new staff positions and infrastructure. It is also notable that at all of these universities the libraries played a prominent role in the design and development of RDM services.

3. Libraries and Research Data Management

To further explore the role of libraries in RDM services offered at research institutions, it is useful to take note of a few papers on this topic. Gold (2007) described the potential role of libraries in managing data, with a focus on social science data, geo-referenced data and bioinformatics data. Henty (2008) surveyed Australian universities to identify existing data management practices and trends, and also explored the possible roles of libraries and librarians in this context. Lewis (2010) examined in detail the roles and skills of university librarians in the UK in the context of RDM and suggested upskilling the existing library workforce through education and training on research data management.

One of the early surveys of the preparation and attitudes of librarians towards research data services was undertaken by Tenopir, Sandusky, Allard and Birch (2012). The survey was conducted among 223 librarians of the Association of Research Libraries (ARL). The findings indicated that although a very low percentage of libraries were involved in RDMS offerings, the librarians believed that this was an important service for academic research libraries to render. Similar findings were reported from the survey conducted by Corrall, Kennan and Afzal (2013) among 140 libraries in Australia, New Zealand, Ireland and the United Kingdom. They also found that RDM services represent a relatively new development in library service offerings, though there was interest among the libraries in offering research data services (RDS), with a high proportion of libraries planning to offer RDM service support.

Meanwhile, Tenopir, Sandusky, Allard and Birch (2014) extended their previous study to survey and understand the perceptions of 223 library directors in US and Canadian libraries towards RDM services. They found that RDMS was not frequently offered in libraries, but some academic and research libraries were already offering RDM services, with more planning to initiate RDMS in the next couple of years. A small but growing number of libraries were becoming more involved in RDM by helping with data management plans and with preparing and preserving research data.

To examine the contribution of academic libraries to research data management, Pinfield, Cox and Smith (2014) conducted semi-structured interviews with 26 library staff from different UK institutions. The study found that though libraries were playing an important role in RDM, there was a lack of consistent support from various stakeholders at the institution. The study also identified various factors and issues that were important for successful RDM service implementation. Based on its findings, the study proposed a new model for an RDM programme that could help overcome barriers to the successful implementation of such services.

Another report worth mentioning in this context is the final report of the LIBER working group on E-Science / Research Data Management (Christensen-Dalsgaard, 2012), which concluded with ten recommendations to libraries for providing RDM support.

Existing literature and studies on Research Data Management services indicate quite clearly that there are new opportunities for libraries and library professionals in this area. Library professionals have already established themselves as experts in metadata, data curation and preservation techniques, and hence can now extend their role to research data management as well. They can not only create infrastructure for research data management but also help in designing institutional policies and frameworks, and build a bridge between administrative staff and researchers in developing research data management services.

4. Research Data Registry, Repository and Software

Prior to embarking on the development of an RDM service, it is important to study existing initiatives in the form of registries and repositories. A registry typically lists various research data repositories, while the repositories themselves host the research data. Table 1 lists the two main registries of research data repositories on various topics.

Table 1: Research Data Registries

| Service | About | Founded by | Number | Website |
|---|---|---|---|---|
| Databib | Tool to locate online repositories of research data; originally sponsored by a Sparks Innovation National Leadership Grant | Institute of Museum and Library Services; hosted by Purdue University | 993 | http://databib.org/index.php |
| Re3data | Registry of research data repositories covering research data from different academic disciplines | German Research Foundation, in partnership with Berlin School of Library and Information Science, GFZ German Research Centre for Geosciences, KIT Library and Purdue University | 1093 | http://www.re3data.org |

The popular repositories of research data include Dryad, Figshare and Harvard Dataverse. Table 2 depicts the main features of these three research data repositories.

Table 2: Research Data Repositories

| Service | About | Founded by | Number | Deposit Policy | Website |
|---|---|---|---|---|---|
| Dryad | Curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable and citable. There is a membership for organizations to submit their data to the Dryad repository | Nonprofit membership organization hosted at North Carolina State University | Data Package 7407; KNB 24249; TreeBASE 2636 | Users can submit their data to this repository; there is also an option for organizations to take membership of Dryad | http://datadryad.org |
| Figshare | Repository where users can make all of their research outputs available in a citable, sharable and discoverable manner | Independent body supported by Digital Science | 1120830 Files | Users can submit their data; open access data is free, private storage is charged | http://figshare.com |
| Harvard Dataverse Network | Free and open to all researchers worldwide to share, cite, reuse and archive research data | Institute of Quantitative Social Science at Harvard University | 755386 Files; 886 data repositories | Researchers can upload data into the Dataverse network free up to 1 TB | https://thedata.harvard.edu/dvn |

KNB = Knowledge Network for Biocomplexity; TreeBASE = a repository of phylogenetic information.

One of the most important factors in the successful implementation of an RDM service is the selection of software to manage the research data effectively. The options available as open source software include Databank, CKAN and Dataverse. Table 3 lists these open source options along with their websites.

Table 3: Open Source Software Solutions for RDM

| Software | Created by | Software and Platform | Website |
|---|---|---|---|
| Databank | Oxford Bodleian Libraries | Open source software; Linux platform | http://www.dataflow.ox.ac.uk/index.php/databank |
| CKAN | Open Knowledge Foundation | Open source software; Linux platform | http://ckan.org |
| Dataverse | Institute of Quantitative Social Science at Harvard University | Open source software; Linux platform | http://dataverse.org |

DataBank: DataBank is a scalable data repository designed for institutional deployment. It is an open source project that promotes free-to-use, cloud-hosted systems for the management, preservation and publication of research data sets. DataBank was created by the Oxford Bodleian Libraries.

CKAN: The Comprehensive Knowledge Archive Network (CKAN) is an open source data management platform adopted by numerous governments, organizations and communities around the world. CKAN is supported by the Open Knowledge Foundation (OKFN) and is one of the most popular research data management software packages available. There are 116 CKAN instances around the world, covering all types of organizations, including government, local government, academic, community and other organizations. Most government open data sites run on CKAN. Very few academic institutions have adopted CKAN for their research data management needs.
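As an illustration of how a CKAN instance is typically queried, the following minimal Python sketch builds a dataset search request for CKAN's Action API (the `package_search` action) and reads dataset titles out of a response. The base URL, the sample response and the dataset titles are hypothetical placeholders, not data from the IIMA repository, and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10):
    """Return the Action API URL for a full-text dataset search."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [item["title"] for item in payload["result"]["results"]]


# Hypothetical response body, shaped like a package_search result.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "state-gdp", "title": "State GDP 2001-2013"},
            {"name": "census-abstract", "title": "District Census Abstract"},
        ],
    },
})

if __name__ == "__main__":
    # data.example.org is a placeholder host, not a real repository.
    print(build_search_url("http://data.example.org/", "gdp"))
    print(dataset_titles(SAMPLE_RESPONSE))
```

In practice the URL would be fetched with any HTTP client and the response body passed to `dataset_titles`.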

5. Research Data Management Repository at IIMA

Indian Institute of Management Ahmedabad is one of the leading business schools in India and the world. Its library, the Vikram Sarabhai Library (VSL), serves about 100 core faculty members, 100 research students, 1,000 students and a good number of research associates working at IIM Ahmedabad. VSL also caters to researchers from around the country, mainly in the subjects of business and management.

Business and management researchers need different kinds of data sets for their research, which they source from subscribed resources, open data, or primary data that they collect themselves through surveys and other instruments. The library and its staff play a key facilitating role in research data collection from subscribed and open sources. In addition to identifying the sources, library staff are actively involved in downloading, collecting, organising and disseminating datasets to library users according to individual needs. These datasets can include raw data, processed data or analysed data. Over time, with the increased focus on research at the Institute, large volumes of datasets are expected to be generated. The challenge will be to preserve these datasets appropriately for future access and reuse.

The library and the academic research fraternity at the Indian Institute of Management Ahmedabad therefore need a service that helps preserve data in a standardised format for long-term access and reuse. An effective research data management service may be the solution to this challenge, and this paper proposes an RDM service for the Institute.

5.1 CKAN - Software solution adopted for the RDM service at IIMA

The existing open source software options were reviewed; the options considered were the Harvard Dataverse Network, Databank and CKAN. One of the major factors in favour of CKAN was that many data repositories had adopted it and it seemed to be quite popular with national governments for hosting their data repositories. In addition to excellent features, CKAN also has a very active user community, and this was one of the main reasons why it was selected to develop the IIMA RDM service.

CKAN has all the important functions, such as data storage, licensing, metadata, persistent URLs and authentication, that are required in research data management. The major features are:

Integration with the institutional research environment (e.g. hooks into the CRIS system, Institutional Repository, DMP Online, networked storage)

Capturing the research process/context/activity notation, not just data

Controlled access for research partners

Good, comprehensive search tools

Version control for data and metadata

Customisable, extensible metadata

Adherence to data standards, e.g. RDF

Multi-level access policies

Secure, backed-up, scalable file storage for anywhere access to files and file sharing (e.g. Dropbox)

Command-line tools and a good web UI for deposit/update of data

Permanent URIs for citation, e.g. DOIs

Import/export of common data formats

Linking datasets (by project, type, research output, person, etc.)

Rights/license management

Commercial support; widely used, popular platform

One limitation is that the documentation for installation and customization is not comprehensive, which could be a barrier for non-IT users.

5.2 Installation and Customization

CKAN can be installed in three ways: (1) from an operating system package, (2) from source, or (3) using a Docker image. The recommended operating system for CKAN is Ubuntu 12.04 64-bit. CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. The CKAN website provides a detailed manual for each procedure, which can be followed to install CKAN for a given data repository.

5.3 Creating an IIMA Dataset Repository

A dataset is a unit or group of data; it can be, for example, demographic data of a country, financial data or economic data. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML and RDF. The data can be stored internally, or a link can be provided to an external data host. A dataset may contain multiple types of data, and each can be added separately.
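The two-part structure described above can be sketched as a plain Python dictionary. The keys below follow the field names used by CKAN's `package_create` action as we understand them (`name` for the URL slug, `notes` for the description, `owner_org` for the organization, `private` for visibility, `url` for the source); the State field is omitted because CKAN manages it, and the helper functions, example values and URLs are hypothetical.

```python
import re

# CKAN dataset names: lowercase letters, digits, "-" and "_", 2 to 100 chars.
SLUG_PATTERN = re.compile(r"^[a-z0-9_-]{2,100}$")


def build_dataset(title, slug, description, tags, license_id, organization,
                  private=True, source_url="", author="", maintainer=""):
    """Build the metadata part of a dataset (no data files yet)."""
    if not SLUG_PATTERN.match(slug):
        raise ValueError(f"invalid dataset name: {slug!r}")
    return {
        "title": title,
        "name": slug,                      # URL field in the web form
        "notes": description,              # Description
        "tags": [{"name": tag} for tag in tags],
        "license_id": license_id,
        "owner_org": organization,
        "private": private,                # Visibility
        "url": source_url,                 # Source
        "author": author,
        "maintainer": maintainer,
        "resources": [],                   # the data part, filled in below
    }


def add_resource(dataset, name, fmt, url):
    """Attach one data file or external link; a dataset may hold several."""
    dataset["resources"].append({"name": name, "format": fmt, "url": url})
    return dataset


if __name__ == "__main__":
    ds = build_dataset(
        title="State GDP", slug="state-gdp",
        description="Illustrative example record",
        tags=["economy", "india"], license_id="cc-by", organization="vsl",
    )
    add_resource(ds, "GDP table", "CSV", "http://data.example.org/gdp.csv")
    add_resource(ds, "Codebook", "PDF", "http://data.example.org/gdp.pdf")
    print(ds["name"], [r["format"] for r in ds["resources"]])
```

Keeping the resources as a list mirrors the point made above that one dataset can carry several types of data, each added separately.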

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users. Before adding datasets, the administrator must create ‘organizations’ and ‘groups’. Each organization has its own administrator, who can decide the rights and responsibilities of registered users in that organization. Each dataset is owned by an ‘organization’, and each ‘organization’ can have its own workflow and authorization procedure. Datasets can be made private or public, which lets the organization administrator decide whether a dataset is limited to specific organizations or not. The administrator of the organization can also assign the role of editor or viewer to each registered user of that organization.
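The ownership and role rules described above can be modelled in a few lines of Python. This is a simplified sketch of the policy as stated in this section, not CKAN's own authorization code; the class names, the role names taken from the text (admin, editor, viewer) and the sample users are illustrative assumptions.

```python
from dataclasses import dataclass, field

# Higher rank means more rights; unknown users rank 0.
ROLE_RANK = {"viewer": 1, "editor": 2, "admin": 3}


@dataclass
class Organization:
    name: str
    members: dict = field(default_factory=dict)  # user name -> role

    def assign(self, user, role):
        """Organization admin assigns a role to a registered user."""
        if role not in ROLE_RANK:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role


@dataclass
class Dataset:
    title: str
    owner: Organization   # every dataset is owned by one organization
    private: bool = True


def can_view(user, dataset):
    """Public datasets are open to all; private ones only to members."""
    return (not dataset.private) or user in dataset.owner.members


def can_edit(user, dataset):
    """Editing requires at least the editor role in the owning organization."""
    role = dataset.owner.members.get(user)
    return ROLE_RANK.get(role, 0) >= ROLE_RANK["editor"]


if __name__ == "__main__":
    library = Organization("library")
    library.assign("reference_staff", "editor")
    library.assign("phd_student", "viewer")
    survey = Dataset("Survey responses", owner=library, private=True)
    print(can_view("phd_student", survey), can_edit("phd_student", survey))
    print(can_view("outsider", survey), can_edit("reference_staff", survey))
```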

The content initially planned for the IIMA RDM service includes:

Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw data, processed data or analysed data.

Datasets generated by the library staff from open access or subscribed sources and compiled in a format that will be useful to researchers.

The major objectives of the IIMA RDM service are:

Long-term preservation of datasets generated by its researchers.

Sharing datasets for collaborative research.

Developing an archive of datasets already compiled in response to earlier reference queries, so that the datasets can be reused not only by researchers but also by the reference staff in addressing future reference requests.

Standardizing various datasets and integrating them into a single platform to facilitate search and discovery in addition to preservation.

Assisting researchers in avoiding duplication of effort in compiling datasets.
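One simple way to support the objective of avoiding duplicate compilation work is to compare file checksums before a dataset is added. The sketch below is an illustrative assumption about how this could be done, not a description of the IIMA implementation; the file names and contents are made up.

```python
import hashlib
from collections import defaultdict


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(files: dict) -> list:
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a file name to its contents as bytes.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[checksum(data)].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


if __name__ == "__main__":
    compiled = {
        "gdp_2013.csv": b"state,gdp\nGujarat,1\n",
        "gdp_copy.csv": b"state,gdp\nGujarat,1\n",
        "census.csv": b"district,population\n",
    }
    print(find_duplicates(compiled))
```

Checksums only catch exact copies; datasets that were re-compiled with different column orders or formats would still need a metadata search.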

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 data sets in 2 organisations.

Figure 2: Home Page of Data Repository

Figure 3: Data Sets

Figure 4: Details of data set

Figure 5: Data Set Preview

6. Conclusion

Research Data Management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making and planning may differ from institution to institution in the context of RDM services. However, one common critical factor in the success of an RDM service is the active participation of the major stakeholders, such as the director, administrators, library and IT staff, and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate whether they wish to share their data openly or restrict its use to private communities.

The sharing of data is increasingly becoming a globally accepted objective, with governments mandating open access to the data and research they fund. The opening up of World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA research is being emphasised for facultyrecruitment and evaluation leading to a situationwherein large volumes of research data would begenerated In addition to this type of data an in-crease in research data based reference services bythe library has forced the implementation of RDMservice in the library Instead of waiting to developinstitutional policies for sharing or securing the li-brary has embarked on a path to create a researchdataset repository first and then look at policy is-sues relating to access The library is sure of the useof this RDM service for its users and also to thereference staff in the library

Selecting the software CKAN did involve consider-able time and was finally identified to be appropri-ate as it was the best available in the open sourcedomain As the importance and implementation ofsuch services increase among the libraries in future

we will see that a number of technologies and toolswould be available for adoption It is envisaged thatthe research community of IIMA will find the exist-ing RDM service useful and lead to more researchat the Institute

References

1 Akers K G Sferdean F C Nicholls N H ampGreen J A (2014) Building Support for Re-search Data Management Biographies of EightResearch Universities International Journal ofDigital Curation 9(2) 171ndash191 doi102218ijdcv9i2327

2 Christensen-Dalsgaard B (2012) Ten recom-mendations for libraries to get started with re-search data management Final report of theLIBER working group on E-Science ResearchData Management Retrieved January 30 2014from httpwwwlibereuropeeusitesdefaultfilesThe20research20data20group20201220v720finalpdf

3 Corrall S Kennan M A amp Afzal W (2013)Bibliometrics and Research Data ManagementServices Emerging Trends in Library Supportfor Research Library Trends 61(3) 636ndash674doi101353lib20130005

4 Cox A M amp Pinfield S (2013) Research datamanagement and libraries Current activities andfuture priorities Journal of Librarianship andInformation Science 0961000613492542doi1011770961000613492542

5 EPSRC Policy Framework on Research DataScope and benefitsEngineering and PhysicalSciences Research Council 2011

Available from httpwwwepsrcacukaboutstandardsresearchdatascope

6 Gold A (2007) Cyberinfrastructure data andlibraries part 2 Libraries and the data challengeRoles and actions for libraries D-Lib Magazine

- 495 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

13(910) httpwwwdliborgdlibseptember07gold09gold-pt2html

7 Henty M (2008) Dreaming of data Thelibraryrsquos role in supporting e-research and datamanagement Australian Library and Informa-tion Association Biennial Conference AliceSprings Available at httpapsranueduaupresentationshenty_alia_08pdf

8 Jones S Pryor G amp Whyte A (2013) lsquoHow toDevelop Research Data Management Services- a guide for HEIsrsquo DCC How-to GuidesEdinburgh Digital Curation Centre Availableat httpwwwdccacukresourceshow-guidesh o w - d e v e l o p - r d m -servicessthashkfiImTyydpuf

9 Lewis MJ (2010) Libraries and the manage-ment of research data In McKnight S editorEnvisioning future academic library servicesLondon Facet Available at httpe p r i n t s w h i t e r o s e a c u k 1 1 1 7 1 1 LEWIS_Chapter_v10pdf

10 Pinfield S Cox A M amp Smith J (2014) Re-search Data Management and Libraries Rela-tionships Activities Drivers and InfluencesPLoS ONE 9(12) e114734 doi101371journalpone0114734

11 Pryor G (ed) (2012) Managing Research DataLondon Facet

12 Tenopir C Sandusky R J Allard S amp BirchB (2014) Research data management servicesin academic research libraries and perceptionsof librarians Library amp Information Science Re-search 36(2) 84ndash90 doi101016jlisr201311003

13 Tenopir C Sandusky R J Allard S amp BirchB (2012) Academic librarians and research dataservices preparation and attitudes IFLA Jour-nal 39(1) 70ndash78 doi1011770340035212473089

14 UK Data Service Research data life cycle Avail-able at httpukdataserviceacukmanage-datalifecycleaspx

15 Univeristy of Lincoln Orbital Project Choos-ing CKAN for research data managementAvailable at httporbitalblogslincolnacuk20120906choosing-ckan-for-research-data-management

16 Whyte A Tedds J (2011) lsquoMaking the Casefor Research Data Managementrsquo DCC BriefingPapers Edinburgh Digital Curation CentreAvailable online httpwwwdccacukre-sourcesbriefing-papers - See more at httpwwwdccacukresourcesbriefing-papersmak-ing-case-rdmsthashY9tAoGFudpuf

Website visited

1 httpckanorg

2 httpdatabiborgindexphp

3 httpdatadryadorg

4 httpfigsharecom

5 httpwwwdataflowoxacukindexphpdatabank

6 httpwwwre3dataorg

7 httpsthedataharvardedudvn

About Authors

Mr Mallikarjun Dora Professional AssistantIndian Institute of Management AhmedabadVastrapur AhmedabadEmailmallikarjuniimahdernetin

Dr H Anil Kumar Librarian Indian Institute ofManagement Ahmedabad VastrapurAhmedabadEmailanilkumariimahdernetin

Page 3: Managing Research Data in Academic Institutions: Role of ...eprints.rclis.org/24911/1/50.pdf · One of the global emerging trends in academic libraries is to facilitate the management

- 486 -

10th International CABLIBER 2015 Managing Research Data in Academic Institutions

Figure 1 Research Data Life Cycle

Image Source httpukdataserviceacukmanage-datalifecycleaspx

Cox and Pinfield (2014) describe research data man-agement service as consisting of different activitiesand processes that include creation storage secu-rity preservation retrieval sharing and re-use ofdata including technical capabilities ethical consid-eration legal issues and governance framework

There are many benefits an institution can derive through an effective research data management service. Some of them include:

• Long-term preservation of data allows the data to be validated in the future, and this enhances the credibility and transparency of the research data used.
• Research data can be reused by the same researchers, or by others who may like to extend the use of such data for purposes unforeseen by the initial researcher.
• Well-managed research data can always be updated to enhance or extend the understanding of existing research on this data.
• It is economical to reuse data, saving time and resources for an institution and hence providing opportunities to invest elsewhere.
• Opening such research data sets to the public increases the visibility of the host institution and its researchers.
• A research data management service enhances the discoverability of such data, thereby facilitating quality research.

The design and development of a research data management service would have to consider many factors, including:

• Identifying the various stakeholders who would contribute to, manage and use the service.
• Understanding the needs of these stakeholders in order to design a sustainable and user-friendly research data management service.
• Reviewing and adopting standards recommended in the various guidelines published for this purpose.
• Reviewing the need either to develop customised software to handle the requirements or to adopt existing commercial or open source systems.
• Studying and reviewing the IT infrastructure required for successful implementation of a research data management service.
• Developing institution-relevant guidelines and policies for hosting, sharing and reuse of research data by its researchers.

Jones, Pryor and Whyte (2013) have developed a useful working-level guide for higher education institutions planning to create Research Data Management (RDM) services. The guide provides a working description of why RDM services are required, the roles and responsibilities of the stakeholders, the process of developing the service, and a detailed description of the various components of the RDM service. The guide is also very useful for institutions planning to implement RDM services, as it provides links to a large body of training materials.

It may be worthwhile to look at the experiences of various institutions around the world that have been offering RDM services. Akers, Sferdean, Nicholls and Green (2014) review eight US universities classified by the Carnegie Foundation as research universities engaged in ‘very high’ research activity. The universities include Cornell, Emory, Johns Hopkins, Pennsylvania State, Purdue, Illinois at Urbana-Champaign, Michigan and Virginia. The authors found that, despite their differences in approaching RDM implementation, most of the institutions face a common challenge in developing RDM support programs. The other major issues highlighted in the study include challenges in reaching out to researchers to improve research data management practices and in seeking funding for new staff positions and infrastructure. It is also interesting that, in the case of all these universities, the respective libraries played a prominent role in the design and development of RDM services in their institutions.

3. Libraries and Research Data Management

To further explore the role of libraries in RDM services offered at research institutions, it is worth noting a few papers on this topic. Gold (2007) described the potential role of libraries in managing data, with a focus on social science data, geo-referenced data and bioinformatics data. Henty (2008) surveyed Australian universities to identify existing data management practices and trends; the study also explored the possible roles of libraries and librarians in this context. Lewis (2010) examined in detail the roles and skills of university librarians in the UK in the context of RDM and suggested upskilling the existing library workforce through education and training on research data management.

One of the early surveys to study the preparation and attitude of librarians towards research data services was undertaken by Tenopir, Sandusky, Allard and Birch (2012). The survey was conducted among 223 librarians of the Association of Research Libraries (ARL), and the findings indicated that although a very low percentage of libraries were involved in RDM service offerings, the librarians believed that this was an important service for academic research libraries to render. Similar findings were reported from the survey conducted by Corrall, Kennan and Afzal (2013) among 140 libraries in Australia, New Zealand, Ireland and the United Kingdom. They also found that RDM service represents a relatively new development in library service offerings, though there was interest among the libraries in offering research data services, with a high proportion of libraries in the process of planning to offer RDM service support.

Meanwhile, Tenopir, Sandusky, Allard and Birch (2014) extended their previous study to survey and understand the perceptions of 223 library directors in US and Canadian libraries towards RDM service. They found that RDM services were not frequently offered in libraries, but there were academic and research libraries that were already offering RDM services, with more planning to initiate them in the next couple of years. There was a small but growing number of libraries that were becoming more involved in RDM by helping with data management plans and by preparing and preserving research data.

To examine the contribution of academic libraries to research data management, Pinfield, Cox and Smith (2014) conducted semi-structured interviews with 26 library staff from different UK institutions. The study found that though libraries were playing an important role in RDM, there was a lack of consistent support from various stakeholders at the institution. The study also identified various factors and issues that were important for successful RDM service implementation. Based on its findings, the study proposed a new model for an RDM programme that could help overcome barriers to successful implementation of such services.

Another report worth mentioning in this context is the final report of the LIBER working group on E-Science / Research Data Management (Christensen-Dalsgaard, 2012), which concluded with ten recommendations to libraries for providing RDM support.

Existing literature and studies on Research Data Management services indicate quite clearly that there are new opportunities for libraries and library professionals in this area. Library professionals have already established themselves as experts in metadata, data curation and preservation techniques, and hence can now extend their role to research data management as well. Library professionals can not only create infrastructure for research data management but also help in designing institutional policies and frameworks, and build a bridge between administrative staff and researchers in developing research data management services.

4. Research Data Registry, Repository and Software

Prior to embarking on the development of an RDM service, it is important to study existing initiatives in the form of registries and repositories. A registry typically lists various research data repositories, while the repositories themselves host the research data. Table 1 lists two main registries that index research data repositories on various topics.

Table 1: Research Data Registries

| Service | About | Founded by | Number | Website |
|---|---|---|---|---|
| Databib | Tool to locate online repositories of research data; originally sponsored by a Sparks Innovation National Leadership Grant | Institute of Museum and Library Services; hosted by Purdue University | 993 | http://databib.org/index.php |
| Re3data | Registry of research data repositories that covers research data from different academic disciplines | German Research Foundation, in partnership with Berlin School of Library and Information Science, GFZ German Research Centre for Geosciences, KIT Library and Purdue University | 1093 | http://www.re3data.org |

The popular repositories of research data include Dryad, Figshare and Harvard Dataverse. Table 2 depicts the main features of these three research data repositories.

Table 2: Research Data Repositories

| Service | About | Founded by | Number | Deposit Policy | Website |
|---|---|---|---|---|---|
| Dryad | Curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable and citable. There is a membership for organizations to submit their data to the Dryad repository | Nonprofit membership organization hosted at North Carolina State University | Data Packages: 7407; KNB: 24249; TreeBASE: 2636 | Users can submit their data to this repository; there is also an option for organizations to take membership of Dryad | http://datadryad.org |
| Figshare | Repository where users can make all of their research outputs available in a citable, sharable and discoverable manner | Independent body supported by Digital Science | 1120830 files | Users can submit their data; open access deposits are free, while private storage is charged | http://figshare.com |
| Harvard Dataverse Network | Free and open to all researchers worldwide to share, cite, reuse and archive research data | Institute of Quantitative Social Science at Harvard University | 755386 files; 886 data repositories | Researchers can upload data into the Dataverse Network free up to 1 TB | https://thedata.harvard.edu/dvn |

KNB = Knowledge Network for Biocomplexity; TreeBASE = a repository of phylogenetic information

One of the most important factors in the successful implementation of an RDM service is the selection of software to manage the research data effectively. Open source options include Databank, CKAN and Dataverse. Table 3 lists these open source options along with relevant website links.

Table 3: Open Source Software Solutions for RDM

| Software | Created by | Software and Platform | Website |
|---|---|---|---|
| Databank | Oxford Bodleian Libraries | Open Source Software, Linux Platform | http://www.dataflow.ox.ac.uk/index.php/databank |
| CKAN | Open Knowledge Foundation | Open Source Software, Linux Platform | http://ckan.org |
| Dataverse | Institute of Quantitative Social Science at Harvard University | Open Source Software, Linux Platform | http://dataverse.org |

DataBank: DataBank is a scalable data repository designed for institutional deployment. It is an open source project that promotes free-to-use, cloud-hosted systems for the management, preservation and publication of research data sets. DataBank was created by Oxford Bodleian Libraries.

CKAN: The Comprehensive Knowledge Archive Network (CKAN) is an open source data management platform adopted by numerous governments, organizations and communities around the world. CKAN is supported by the Open Knowledge Foundation (OKFN) and is one of the most popular research data management software packages available. There are 116 CKAN instances around the world, which cover all types of organizations including Government, Local Government, Academic, Community and Other organizations. Most of the Government open data sites run on CKAN software. Very few academic institutions have adopted CKAN for their research data management needs.

5. Research Data Management Repository at IIMA

Indian Institute of Management Ahmedabad (IIMA) is one of the leading business schools in India and the world. Its library, Vikram Sarabhai Library (VSL), serves about 100 core faculty members, 100 research students, 1000 students and a good number of research associates working at IIM Ahmedabad. VSL also caters to researchers from around the country, mainly in the subject of business and management.

Business and management researchers need different kinds of data sets for their research, which they source from subscribed resources, open data, or primary data collected by themselves through surveys and other instruments. The library and its staff play a key facilitating role in research data collection in the context of subscribed and open sources. In addition to identifying the sources, library staff are actively involved in downloading, collecting, organising and disseminating the datasets to library users as per individual needs. These datasets can include raw data, processed data or analysed data. Over a period of time, with increased focus on research at the Institute, large volumes of datasets are expected to be generated. The challenge will be in appropriately preserving these datasets for future access and reuse.

The library and the academic research fraternity at Indian Institute of Management Ahmedabad therefore need a service that helps preserve data in a standardised format for long-term access and reuse. An effective research data management service may be the solution to this challenge, and this paper proposes an RDM service for the Institute.

5.1 CKAN - Software solution adopted for the RDM service at IIMA

The existing open source software options were reviewed, and the options considered included Harvard Dataverse Network, Databank and CKAN. One of the major factors in favour of CKAN was that many data repositories had adopted it and it seemed to be quite popular with various national governments for hosting their data repositories. In addition to excellent features, CKAN also has a very active user community, and this was one of the main reasons why it was selected to develop the IIMA RDM service.

CKAN has all the important functions, such as data storage, licensing, metadata, persistent URLs and authentication, which are very much required in research data management. The major features are:

• Integration with the institutional research environment (e.g. hooks into CRIS system, Institutional Repository, DMP Online, networked storage)
• Capturing the research process/context/activity notation, not just data
• Controlled access for research partners
• Good, comprehensive search tools
• Version control for data and metadata
• Customisable, extensible metadata
• Adherence to data standards, e.g. RDF
• Multi-level access policies
• Secure, backed-up, scalable file storage for anywhere access to files and file sharing (e.g. Dropbox)
• Command-line tools and a good web UI for deposit/update of data
• Permanent URIs for citation, e.g. DOIs
• Import/export of common data formats
• Linking datasets (by project, type, research output, person, etc.)
• Rights/license management
• Commercial support; widely used, popular platform

One drawback is that the documentation for installation and customization is not comprehensive and could be a barrier for non-IT users.
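The search tools listed among the features can also be reached programmatically through CKAN's Action API, whose `package_search` action accepts a free-text query (`q`), a result limit (`rows`) and a filter query (`fq`). The following is a minimal sketch that only builds the request URL; the base URL `https://data.example.org`, the organization name `vsl` and the helper name `build_search_url` are illustrative assumptions, not details of the IIMA installation.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10, organization=None):
    """Build a CKAN package_search URL (hypothetical helper, no network call).

    base_url     -- root URL of a CKAN site (assumed, e.g. https://data.example.org)
    query        -- free-text search terms passed as the `q` parameter
    rows         -- maximum number of datasets to return
    organization -- optional organization name used as a filter query
    """
    params = {"q": query, "rows": rows}
    if organization:
        # Restrict results to datasets owned by one organization.
        params["fq"] = f"organization:{organization}"
    return base_url.rstrip("/") + "/api/3/action/package_search?" + urlencode(params)


if __name__ == "__main__":
    # Example: look for stock price datasets owned by a hypothetical "vsl" organization.
    print(build_search_url("https://data.example.org", "stock prices", rows=5, organization="vsl"))
```

Fetching the resulting URL from a running CKAN site would return a JSON document; the sketch stops at the URL so that it can be tried without access to a server.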

5.2 Installation and Customization

CKAN can be installed in three ways: (1) from an operating system package, (2) from source, or (3) using a Docker image. At the time of writing, the recommended operating system for CKAN is Ubuntu 12.04 64-bit. CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. A detailed manual for each procedure is available on the CKAN website, and one can follow these procedures to install CKAN for the respective data repository.

5.3 Creating an IIMA Dataset Repository

A dataset is a unit of data or a group of data; it can be demographic data of a country, financial data, economic data, etc. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML, RDF, etc. The data can be stored internally, or an external link to the data host can be provided. It may be noted that a dataset may contain multiple types of data, and each can be added separately.
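To make the metadata part of a dataset concrete, the sketch below maps the fields listed above onto the parameter names used by CKAN's `package_create` action and prepares, but does not send, the corresponding HTTP request. The mapping (for example Description to `notes`, Visibility to `private`, Source to `url`) is our reading of the CKAN API and should be checked against the installed version; the site URL, API key and sample values are placeholders, and the helper names are hypothetical.

```python
import json
from urllib.request import Request


def build_dataset_payload(slug, title, description, tags, license_id,
                          organization, private, source_url, author, maintainer):
    """Assemble dataset metadata in the shape expected by package_create."""
    return {
        "name": slug,                            # URL-friendly identifier
        "title": title,                          # Title
        "notes": description,                    # Description
        "tags": [{"name": t} for t in tags],     # Tags
        "license_id": license_id,                # License
        "owner_org": organization,               # Organization
        "private": private,                      # Visibility
        "url": source_url,                       # Source
        "author": author,                        # Author
        "maintainer": maintainer,                # Maintainer
    }


def build_create_request(base_url, api_key, payload):
    """Prepare a POST request for the package_create action without sending it."""
    return Request(
        url=base_url.rstrip("/") + "/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    payload = build_dataset_payload(
        slug="sample-financial-data", title="Sample financial data",
        description="Illustrative dataset record.", tags=["finance"],
        license_id="cc-by", organization="vsl", private=True,
        source_url="https://example.org/source", author="A. Researcher",
        maintainer="Library staff",
    )
    request = build_create_request("https://data.example.org", "PLACEHOLDER-KEY", payload)
    print(request.full_url)
    print(json.dumps(payload, indent=2))
```

The data files themselves would be attached afterwards as resources of the dataset, which corresponds to the second part described above.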

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users of this system. Before adding datasets, the administrator must create ‘organizations’ and ‘groups’. Each organization has its own administrator, who can decide the rights and responsibilities of registered users in that organization. Each dataset is owned by an ‘organization’, and each ‘organization’ can have its own workflow and authorization procedure. Datasets can be made private or public, allowing the organization administrator to decide whether access to a dataset is limited to specific organizations or not. The administrator of the organization can also assign the role of editor or viewer to each registered user of that organization.
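The ownership and role rules described above can be pictured with a small model. This is a simplified sketch of the idea only, not CKAN's actual authorization code: the role names follow the wording of this paper (CKAN's own documentation calls the read-only role "member"), and the classes, the permission table and the `can` function are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed permission table: what each organization role may do.
ROLE_PERMISSIONS = {
    "admin": {"read", "edit", "manage_members"},
    "editor": {"read", "edit"},
    "viewer": {"read"},
}


@dataclass
class Organization:
    name: str
    members: dict = field(default_factory=dict)  # username -> role

    def assign(self, username, role):
        """Give a registered user a role within this organization."""
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[username] = role


@dataclass
class Dataset:
    title: str
    owner: Organization
    private: bool = True


def can(username, action, dataset):
    """Return True if the user may perform the action on the dataset."""
    # Public datasets are readable by anyone, including unregistered visitors.
    if action == "read" and not dataset.private:
        return True
    role = dataset.owner.members.get(username)
    return role is not None and action in ROLE_PERMISSIONS[role]


if __name__ == "__main__":
    library = Organization("vsl")
    library.assign("asha", "editor")
    library.assign("ravi", "viewer")
    survey = Dataset("Survey responses", owner=library, private=True)
    print(can("asha", "edit", survey))   # editor of the owning organization
    print(can("ravi", "edit", survey))   # viewer cannot edit
    print(can("guest", "read", survey))  # private dataset, not a member
```

Keeping permissions in one table makes it easy to see how a private dataset stays restricted to its owning organization while a public one is readable by all.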

The content initially planned for the IIMA RDM service includes:

• Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw data, processed data or analysed data.
• Datasets generated by the library staff from open access or subscribed sources and compiled in a format that will be useful to the researchers.

The major objectives of the RDM service at IIMA are articulated as:

• Long-term preservation of datasets generated by its researchers.
• Sharing data sets for collaborative research.
• Developing an archive of datasets already compiled in response to earlier reference queries, so as to facilitate reuse of the datasets not only by the researchers but also by the reference staff in addressing future reference requests.
• Standardizing various datasets and integrating them into a single platform to facilitate search and discovery in addition to preservation.
• Assisting researchers in avoiding duplication of effort in compiling datasets.

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 data sets in 2 organisations.

Figure 2: Home Page of Data Repository

Figure 3: Data Sets

Figure 4: Details of Data Set

Figure 5: Data Set Preview

6. Conclusion

Research Data Management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making and planning may differ from institution to institution in the context of RDM services. However, one common critical factor in the success of an RDM service is the active participation of the major stakeholders, such as the director, administrators, library and IT staff, and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate their desire to share their data openly or to restrict its use to private communities.

The sharing of data is now increasingly becoming a globally accepted objective, with governments mandating open access to data and research funded by them. The opening up of World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA, research is being emphasised in faculty recruitment and evaluation, leading to a situation wherein large volumes of research data would be generated. In addition to this type of data, an increase in research-data-based reference services by the library has forced the implementation of an RDM service in the library. Instead of waiting to develop institutional policies for sharing or securing data, the library has embarked on a path to create a research dataset repository first and then look at policy issues relating to access. The library is sure of the usefulness of this RDM service for its users and also for the reference staff in the library.

Selecting the software involved considerable time, and CKAN was finally identified as appropriate because it was judged the best available in the open source domain. As the importance and implementation of such services increase among libraries in future, a number of technologies and tools are likely to become available for adoption. It is envisaged that the research community of IIMA will find the existing RDM service useful and that it will lead to more research at the Institute.

References

1. Akers, K. G., Sferdean, F. C., Nicholls, N. H., & Green, J. A. (2014). Building Support for Research Data Management: Biographies of Eight Research Universities. International Journal of Digital Curation, 9(2), 171–191. doi:10.2218/ijdc.v9i2.327

2. Christensen-Dalsgaard, B. (2012). Ten recommendations for libraries to get started with research data management: Final report of the LIBER working group on E-Science / Research Data Management. Retrieved January 30, 2014 from http://www.libereurope.eu/sites/default/files/The%20research%20data%20group%202012%20v7%20final.pdf

3. Corrall, S., Kennan, M. A., & Afzal, W. (2013). Bibliometrics and Research Data Management Services: Emerging Trends in Library Support for Research. Library Trends, 61(3), 636–674. doi:10.1353/lib.2013.0005

4. Cox, A. M., & Pinfield, S. (2013). Research data management and libraries: Current activities and future priorities. Journal of Librarianship and Information Science, 0961000613492542. doi:10.1177/0961000613492542

5. EPSRC Policy Framework on Research Data: Scope and benefits. Engineering and Physical Sciences Research Council, 2011. Available from http://www.epsrc.ac.uk/about/standards/researchdata/scope

6. Gold, A. (2007). Cyberinfrastructure, data, and libraries, part 2: Libraries and the data challenge: Roles and actions for libraries. D-Lib Magazine, 13(9/10). http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

7. Henty, M. (2008). Dreaming of data: The library’s role in supporting e-research and data management. Australian Library and Information Association Biennial Conference, Alice Springs. Available at http://apsr.anu.edu.au/presentations/henty_alia_08.pdf

8. Jones, S., Pryor, G., & Whyte, A. (2013). ‘How to Develop Research Data Management Services - a guide for HEIs’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available at http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

9. Lewis, M. J. (2010). Libraries and the management of research data. In McKnight, S. (editor), Envisioning future academic library services. London: Facet. Available at http://eprints.whiterose.ac.uk/11171/1/LEWIS_Chapter_v10.pdf

10. Pinfield, S., Cox, A. M., & Smith, J. (2014). Research Data Management and Libraries: Relationships, Activities, Drivers and Influences. PLoS ONE, 9(12), e114734. doi:10.1371/journal.pone.0114734

11. Pryor, G. (ed.) (2012). Managing Research Data. London: Facet.

12. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2014). Research data management services in academic research libraries and perceptions of librarians. Library & Information Science Research, 36(2), 84–90. doi:10.1016/j.lisr.2013.11.003

13. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2012). Academic librarians and research data services: preparation and attitudes. IFLA Journal, 39(1), 70–78. doi:10.1177/0340035212473089

14. UK Data Service. Research data life cycle. Available at http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

15. University of Lincoln, Orbital Project. Choosing CKAN for research data management. Available at http://orbital.blogs.lincoln.ac.uk/2012/09/06/choosing-ckan-for-research-data-management/

16. Whyte, A., & Tedds, J. (2011). ‘Making the Case for Research Data Management’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm

Websites visited

1. http://ckan.org

2. http://databib.org/index.php

3. http://datadryad.org

4. http://figshare.com

5. http://www.dataflow.ox.ac.uk/index.php/databank

6. http://www.re3data.org

7. https://thedata.harvard.edu/dvn

About Authors

Mr. Mallikarjun Dora, Professional Assistant, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: mallikarjun@iimahd.ernet.in

Dr. H. Anil Kumar, Librarian, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: anilkumar@iimahd.ernet.in

Page 4: Managing Research Data in Academic Institutions: Role of ...eprints.rclis.org/24911/1/50.pdf · One of the global emerging trends in academic libraries is to facilitate the management

- 487 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

vides working description of why the RDM servicesare required what are the roles and responsibilitiesof the stakeholders the process of developing theservice and a detailed description of the various com-ponents of the RDM service The guide is also veryuseful for institutions planning to implement RDMservices as it provides relevant links to useful linksto a large body of training materials

It may be worthwhile to look at the experiences ofvarious institutions around the world that have beenoffering RDM services Akers Sferdan Nicholls andGreen (2014) review eight US universities classifiedas research universities engaged in lsquovery highrsquoresearch activity by Carnegie Foundation Theuniversities include Cornell Emory John HopkinsPennsylvania State Purdue Illinois at Urbana ndashchampaign Michigan and Virginia The authorsfound that despite their differences in approachingthe RDM implementation most of the institutionsface a common challenge in developing RDMsupport programs The other major issueshighlighted in the study include challenges inreaching out to the researchers to improve researchdata management practices and seeking funding fornew staff positions and infrastructure It is alsointeresting that in the case of all these universitiesrespective libraries played a prominent role in thedesign and development of RDM services in theirinstitutions

3 Libraries and Research Data Management

To further explore the role of libraries in RDM ser-vices being offered at research institutions it maybe interesting to take note a few interesting papersexisting on this topic Gold (2007) described thepotential role of libraries in managing data with afocus on social science data geo referenced data and

bioinformatics data Henty (2008) surveyed Austra-lian universities to identify the existing data man-agement practices and trends He also explored thepossible roles of libraries and librarians in this con-text Lewis (2010) examined in detail the roles andskills of university librarians in UK in the contextRDM and suggested upskilling of the existing libraryworkforce through education and training on re-search data management

One of the early surveys to study the preparationand attitude of librarians towards research data ser-vice was undertaken by Tenopir Sandusky Allardand Birch (2012) The survey was conducted among223 librarians of the Association of Research Librar-ies (ARL) and the findings indicated that althoughthere was very low percentage of libraries involvedin RDMS offerings the librarians believed that thiswas an important service for academic research li-braries to render Similar findings were reportedfrom the survey conducted by Corrall Kennan andAfzal (2013) among 140 libraries in Australia NewZealand Ireland and United Kingdom They alsofound that RDM service represents a relatively newdevelopment in library service offering though therewas an interest among the libraries to offer RDSwith a high proportion of libraries in the process ofplanning to offer RDM services support

Meanwhile Tenopir Sandusky Allard and Birch(2014) extended their previous study to survey andunderstand the perceptions of 223 library directorsin US and Canadian libraries towards RDM serviceThey found that RDMS was not frequently employedin libraries but there were academic and researchlibraries that were already offering RDM serviceswith more planning to initiate RDMS in the nextcouple of years There was a small but growingnumber of libraries that were becoming more in-

- 488 -

10th International CABLIBER 2015 Managing Research Data in Academic Institutions

volved in RDM by helping with data managementplans and preparing and preserving research data

To examine the contribution of academic librariesto research data management (Pinfield Cox andSmith 2014) interviewed (semi structured) 26 librarystaff from different UK institutions The study foundthat though libraries were playing an important rolein RDM there was a lack of consistent support fromvarious stakeholders at the institution The studyalso identified various factors and issues that wereimportant for successful RDM service implementa-tion The study based on its findings proposed anew model for a RDM programme that could helpovercome barriers to successful implementation ofsuch RDM services

Another report worth mentioning in this context isthe final report of the LIBER working group on E-Science Research Data Management (Christensen-Dalsgaard 2012) that concluded with ten recom-mendations to libraries for providing RDM support

Existing literature and studies on Research DataManagement services indicate quite clearly that there

are new opportunities for libraries and library pro-fessionals in this area Library professionals havealready established themselves as experts inmetadata data curation and preservation tech-niques and hence can now extend their role to re-search data management also The library profes-sionals can not only create infrastructure for the re-search data management but also extend help indesigning institutional policies and frameworksbuild a bridge between administrative staff and re-searchers in developing research data managementservices

4 Research Data Registry Repository andSoftware

Prior to embarking on development of a RDM ser-vice it would be important to study various existinginitiatives in the form of registries and repositoriesA registry would typically list out various researchdata repositories while repositories in themselveswould be hosting the research data Table 1 indi-cates that two main registries that list out researchdata repositories on various topics

Table 1 Research Data Registries

Service About Founded by Number Website

Databib Tool to locate online repositories of Institute of Museum and Library 993 httpdatabiborgindexphp

research data originally sponsored Services hosting by University

by a Sparks Innovation National of Purdue

Leadership Grant

Re3data Registry for research data repository that German Research Foundation Partner with 1093 httpwwwre3dataorg

covers research data from different Berlin School of Library and Information Science

academic discipline GFZ German Research Centre for Geosciences

KIT Library

Purdue University

The popular repositories of research data include Dryad Figshare and Harvard Dataverse Table 2 depictsthe main features of these three research data repositories

- 489 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

Table 2 Research Data Repositories

Service About Founded by Num ber Deposit Policy Website

Dryad Curated general purpose repository Is nonprofit membership Data Package User can submit their httpdatadryadorg

that makes the data underlying organization hosted at North 7407KNB 24249 data to this repository

scientific publication discoverable Corolina State University TreeBASE2636 there is also an option

freely reusable and citableThere where organization can

is a membership for organization take membership of dryad

to submit their data to dryad

repository

Figshare is repository where user can make Is an independent body 1120830 Files User can submit their httpfigsharecom

all of their research outputs available supported by digital science data as open access are

in a citable sharable and discoverable free private storage there

manner are charges

Harvard free and open to all researchers Institute of Quantitative Social 755386 Files Researcher can upload data https

Dataverse worldwide to share cites Science at Harvard University 886 data into Dataverse network thedataharvardedu

Network reuse and archive research data repositories free up to 1 TB dvn

KNB= Knowledge Network for BiocomplexityTreeBASE= A repository of phylogenetic Informa-tion

One of the most important factors in the successful implementation of an RDM service is the selection of software to manage the research data effectively. The options available from the basket of open source software include Databank, CKAN and Dataverse. Table 3 lists these open source options along with the relevant website links.

Table 3 Open Source Software Solutions for RDM

| Software | Created by | Software and Platform | Website |
|---|---|---|---|
| Databank | Oxford Bodleian Libraries | Open source software, Linux platform | http://www.dataflow.ox.ac.uk/index.php/databank |
| CKAN | Open Knowledge Foundation | Open source software, Linux platform | http://ckan.org |
| Dataverse | Institute of Quantitative Social Science at Harvard University | Open source software, Linux platform | http://dataverse.org |

DataBank: DataBank is a scalable data repository designed for institutional deployment. It is an open source project that promotes free-to-use, cloud-hosted systems for the management, preservation and publication of research data sets. DataBank was created by Oxford Bodleian Libraries.

CKAN: The Comprehensive Knowledge Archive Network (CKAN) is an open source data management platform adopted by numerous governments, organizations and communities around the world. CKAN is supported by the Open Knowledge Foundation (OKFN) and is one of the most popular research data management software packages available. There are 116 CKAN instances around the world, covering all types of organizations, including Government, Local Government, Academic, Community and Other organizations. Most of the government open data sites run on CKAN software. Very few academic institutions have adopted CKAN for their research data management needs.

5. Research Data Management Repository at IIMA

The Indian Institute of Management Ahmedabad is one of the leading business schools in India and the world. Its library, the Vikram Sarabhai Library (VSL), serves about 100 core faculty members, 100 research students, 1000 students and a good number of research associates working at IIM Ahmedabad. VSL also caters to researchers from around the country, mainly in the subject of business and management.

Business and management researchers need different kinds of data sets for their research, which they source from subscribed resources, open data, or primary data that they collect themselves through surveys and other instruments. The library and its staff play a key facilitating role in research data collection in the context of subscribed and open sources. In addition to identifying the sources, library staff are actively involved in downloading, collecting, organising and disseminating the datasets to library users as per individual needs. These datasets can include raw data, processed data or analysed data. Over time, with the increased focus on research at the Institute, large volumes of datasets are expected to be generated. The challenge will be to preserve these datasets appropriately for future access and reuse.

It is pertinent that the library and the academic research fraternity at the Indian Institute of Management Ahmedabad implement a service that helps preserve data in a standardised format for long-term access and reuse. An effective research data management service may be the ultimate solution to this challenge, and this paper proposes an RDM service for the Institute.

5.1 CKAN - Software solution adopted for the RDM service at IIMA

The existing open source software options were reviewed; the options considered included Harvard Dataverse Network, Databank and CKAN. One of the major factors in favour of CKAN was that many data repositories had adopted it, and it seemed to be quite popular with various national governments for hosting their data repositories. In addition to excellent features, CKAN also has a very active user community, and this was one of the main reasons why it was selected to develop the IIMA RDM service.

CKAN had all the important functions, such as data storage, licensing, metadata, persistent URLs and authentication, which are very much required in research data management. The major features are:

Integration with the institutional research environment (e.g. hooks into CRIS system, Institutional Repository, DMP Online, networked storage)

Capturing the research process/context/activity notation, not just data

Controlled access to research partners

Good, comprehensive search tools

Version control for data and metadata

Customisable, extensible metadata

Adherence to data standards, e.g. RDF

Multi-level access policies

Secure, backed-up, scalable file storage for anywhere access to files and file sharing (e.g. Dropbox)

Command-line tools and good web UI for deposit/update of data

Permanent URIs for citation, e.g. DOIs

Import/export of common data formats

Linking datasets (by project, type, research output, person, etc.)

Rights/license management

Commercial support/widely used, popular platform (

Documentation for installation and customization is not comprehensive and could be a barrier for non-IT users

5.2 Installation and Customization

CKAN can be installed through three procedures: (1) install from an operating system package, (2) install from source, or (3) install using a Docker image. The recommended operating system for CKAN is Ubuntu 12.04 64-bit. CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. The CKAN website provides a detailed manual for each procedure, which can be followed to install CKAN for the respective data repository.
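Whichever installation route is chosen, one simple way to confirm that the instance responds is to query its Action API. The following is a minimal sketch and not part of the IIMA setup: it assumes the `status_show` action of CKAN's version 3 Action API, and the site URL and the sample response are made-up placeholders. The code only builds the request URL and parses a response body, so it can be tried without network access.

```python
import json


def action_url(site_url: str, action: str) -> str:
    """Build the URL of a CKAN Action API (v3) call for the given site."""
    return f"{site_url.rstrip('/')}/api/3/action/{action}"


def parse_status(body: str) -> dict:
    """Extract a few fields from a `status_show` response body.

    Raises ValueError if the API reports that the call failed.
    """
    reply = json.loads(body)
    if not reply.get("success"):
        raise ValueError("CKAN reported an unsuccessful call")
    result = reply["result"]
    return {
        "version": result.get("ckan_version"),
        "title": result.get("site_title"),
        "extensions": result.get("extensions", []),
    }


# Hypothetical site URL and hand-written response, for illustration only.
SITE = "http://data.example.org/"
SAMPLE = json.dumps({
    "success": True,
    "result": {
        "ckan_version": "2.2",
        "site_title": "Demo repository",
        "extensions": ["stats"],
    },
})

print(action_url(SITE, "status_show"))
print(parse_status(SAMPLE))
```

In practice the response body would be fetched from the URL returned by `action_url`, for example with `urllib.request.urlopen`, and then passed to `parse_status`.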

5.3 Creating an IIMA Dataset Repository

A dataset is a unit of data or a group of data; it can be demographic data of a country, financial data, economic data, etc. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML and RDF. The data can be stored physically within the repository, or an external link to the data host can be provided. It may be noted that a dataset may contain multiple types of data, and each can be added separately.
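The two-part structure described above (metadata plus one or more data files) can be sketched as the JSON body that CKAN's `package_create` action accepts. This is only an illustration under stated assumptions: the helper `build_dataset`, the dataset name, the organisation identifier and the example URLs are hypothetical, the license identifier `cc-by` is assumed to be enabled on the instance, and the State and Source fields are left out for brevity.

```python
import json
import re

# CKAN dataset names act as URL slugs: lowercase letters, digits, "-" and "_".
NAME_PATTERN = re.compile(r"[a-z0-9_-]{2,100}")


def build_dataset(name, title, owner_org, description="", tags=(),
                  license_id="cc-by", private=True, author="",
                  maintainer="", resources=()):
    """Return a dict shaped like a `package_create` request body.

    `resources` is an iterable of (name, url, format) tuples, one per file
    or external link attached to the dataset.
    """
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid dataset name: {name!r}")
    return {
        # Part 1: metadata describing the dataset.
        "name": name,                 # URL field
        "title": title,               # Title
        "notes": description,         # Description
        "tags": [{"name": tag} for tag in tags],
        "license_id": license_id,     # License
        "owner_org": owner_org,       # Organization
        "private": private,           # Visibility
        "author": author,
        "maintainer": maintainer,
        # Part 2: the data itself, as uploaded files or external links.
        "resources": [
            {"name": r_name, "url": r_url, "format": r_format}
            for r_name, r_url, r_format in resources
        ],
    }


# Hypothetical example: one dataset holding two kinds of data.
dataset = build_dataset(
    name="sample-household-survey",
    title="Sample Household Survey",
    owner_org="demo-library",
    description="Illustrative record only.",
    tags=["survey", "demographics"],
    resources=[
        ("Raw responses", "http://example.org/raw.csv", "CSV"),
        ("Codebook", "http://example.org/codebook.pdf", "PDF"),
    ],
)
print(json.dumps(dataset, indent=2))
```

Keeping each file as a separate entry in `resources` mirrors the point that a dataset may contain multiple types of data, each added separately.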

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users of the system. Before adding datasets, the administrator must create 'organizations' and 'groups'. Each organisation has its own administrator, who can decide the rights and responsibilities of registered users in that organisation. Each dataset is owned by an 'organization', and each 'organization' can have its own workflow and authorization procedure. Datasets can be made private or public, which lets the organisation administrator decide whether a dataset is limited to specific organizations or not. The administrator of the organisation can also assign the role of editor or viewer to each registered user of that organisation.
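The access rules described above can be illustrated with a small model. The sketch below is a simplified illustration and not CKAN's own authorization code: the class and function names are hypothetical, the role names follow this paper (CKAN itself labels the read-only role "member"), and the rule that a private dataset is visible only to users holding a role in the owning organisation is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Roles a registered user can hold inside one organisation.
ROLES = {"viewer", "editor", "admin"}


@dataclass
class User:
    name: str
    # Maps organisation id -> role held in that organisation.
    roles: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    owner_org: str        # every dataset is owned by one organisation
    private: bool = True


def can_view(user: Optional[User], dataset: Dataset) -> bool:
    """Public datasets are open to all; private ones need a role in the owner."""
    if not dataset.private:
        return True
    return user is not None and user.roles.get(dataset.owner_org) in ROLES


def can_edit(user: Optional[User], dataset: Dataset) -> bool:
    """Only editors and admins of the owning organisation may change a dataset."""
    if user is None:
        return False
    return user.roles.get(dataset.owner_org) in {"editor", "admin"}


# Hypothetical organisations and users, for illustration only.
survey = Dataset("faculty-survey", owner_org="research", private=True)
report = Dataset("open-indicators", owner_org="library", private=False)
editor = User("asha", roles={"research": "editor"})
viewer = User("ravi", roles={"research": "viewer"})
outsider = User("guest")

print(can_view(viewer, survey), can_edit(viewer, survey))
print(can_view(outsider, survey), can_view(None, report))
```

Tying every check to the owning organisation keeps each organisation's workflow independent, which is the point of creating organisations before adding datasets.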

The content initially planned for the IIMA RDM service includes:

Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw data, processed data or analysed data

Datasets generated by the library staff from open access or subscribed sources and compiled in a format that will be useful to the researchers

The major objectives of the RDM service at IIMA are articulated as:

Long-term preservation of datasets generated by its researchers

Sharing data sets for collaborative research

Developing an archive of datasets already compiled in response to earlier reference queries, so as to facilitate reuse of the datasets not only by the researchers but also by the reference staff in addressing future reference requests

Standardizing various datasets and integrating them into a single platform to facilitate search and discovery in addition to preservation

Assisting the researchers in avoiding duplication of effort in compiling datasets

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 data sets in 2 organisations.

Figure 2 Home Page of Data Repository


Figure 3 Data Sets

Figure 4 Details of data set

Figure 5 Data Set Preview


6. Conclusion

Research data management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making and planning may differ from institution to institution in the context of RDM services. However, one common critical success factor for an RDM service is the active participation of the major stakeholders, such as the director, administrators, library, IT staff and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate their desire to share their data openly or to restrict its use to private communities.

The sharing of data is increasingly becoming a globally accepted objective, with governments mandating open access to the data and research that they fund. The opening up of World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA, research is being emphasised in faculty recruitment and evaluation, leading to a situation in which large volumes of research data will be generated. In addition to this type of data, an increase in research-data-based reference services by the library has forced the implementation of an RDM service in the library. Instead of waiting to develop institutional policies for sharing or securing data, the library has embarked on a path to create a research dataset repository first and then look at policy issues relating to access. The library is sure of the usefulness of this RDM service to its users and also to the reference staff in the library.

Selecting CKAN as the software involved considerable time; it was finally identified as appropriate because it was the best available option in the open source domain. As the importance and implementation of such services increase among libraries in future, a number of technologies and tools will become available for adoption. It is envisaged that the research community of IIMA will find the existing RDM service useful and that it will lead to more research at the Institute.

References

1. Akers, K. G., Sferdean, F. C., Nicholls, N. H., & Green, J. A. (2014). Building Support for Research Data Management: Biographies of Eight Research Universities. International Journal of Digital Curation, 9(2), 171–191. doi:10.2218/ijdc.v9i2.327

2. Christensen-Dalsgaard, B. (2012). Ten recommendations for libraries to get started with research data management: Final report of the LIBER working group on E-Science / Research Data Management. Retrieved January 30, 2014, from http://www.libereurope.eu/sites/default/files/The%20research%20data%20group%202012%20v7%20final.pdf

3. Corrall, S., Kennan, M. A., & Afzal, W. (2013). Bibliometrics and Research Data Management Services: Emerging Trends in Library Support for Research. Library Trends, 61(3), 636–674. doi:10.1353/lib.2013.0005

4. Cox, A. M., & Pinfield, S. (2013). Research data management and libraries: Current activities and future priorities. Journal of Librarianship and Information Science, 0961000613492542. doi:10.1177/0961000613492542

5. EPSRC. Policy Framework on Research Data: Scope and benefits. Engineering and Physical Sciences Research Council, 2011. Available from http://www.epsrc.ac.uk/about/standards/researchdata/scope

6. Gold, A. (2007). Cyberinfrastructure, data, and libraries, part 2: Libraries and the data challenge: Roles and actions for libraries. D-Lib Magazine, 13(9/10). http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

7. Henty, M. (2008). Dreaming of data: The library's role in supporting e-research and data management. Australian Library and Information Association Biennial Conference, Alice Springs. Available at http://apsr.anu.edu.au/presentations/henty_alia_08.pdf

8. Jones, S., Pryor, G., & Whyte, A. (2013). 'How to Develop Research Data Management Services - a guide for HEIs'. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available at http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

9. Lewis, M. J. (2010). Libraries and the management of research data. In McKnight, S. (editor), Envisioning future academic library services. London: Facet. Available at http://eprints.whiterose.ac.uk/11171/1/LEWIS_Chapter_v10.pdf

10. Pinfield, S., Cox, A. M., & Smith, J. (2014). Research Data Management and Libraries: Relationships, Activities, Drivers and Influences. PLoS ONE, 9(12), e114734. doi:10.1371/journal.pone.0114734

11. Pryor, G. (ed.) (2012). Managing Research Data. London: Facet.

12. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2014). Research data management services in academic research libraries and perceptions of librarians. Library & Information Science Research, 36(2), 84–90. doi:10.1016/j.lisr.2013.11.003

13. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2012). Academic librarians and research data services: preparation and attitudes. IFLA Journal, 39(1), 70–78. doi:10.1177/0340035212473089

14. UK Data Service. Research data life cycle. Available at http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

15. University of Lincoln, Orbital Project. Choosing CKAN for research data management. Available at http://orbital.blogs.lincoln.ac.uk/2012/09/06/choosing-ckan-for-research-data-management

16. Whyte, A., & Tedds, J. (2011). 'Making the Case for Research Data Management'. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm

Websites visited

1. http://ckan.org

2. http://databib.org/index.php

3. http://datadryad.org

4. http://figshare.com

5. http://www.dataflow.ox.ac.uk/index.php/databank

6. http://www.re3data.org

7. https://thedata.harvard.edu/dvn

About Authors

Mr. Mallikarjun Dora, Professional Assistant, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: mallikarjun@iimahd.ernet.in

Dr. H. Anil Kumar, Librarian, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: anilkumar@iimahd.ernet.in





Table 2: Research Data Repositories

| Service | About | Founded by | Number | Deposit Policy | Website |
| --- | --- | --- | --- | --- | --- |
| Dryad | Curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable and citable. Organizations can take membership to submit their data to the Dryad repository. | A nonprofit membership organization hosted at North Carolina State University | Data packages: 7,407; KNB: 24,249; TreeBASE: 2,636 | Users can submit their data to this repository; organizations also have the option of taking membership of Dryad. | http://datadryad.org |
| Figshare | A repository where users can make all of their research outputs available in a citable, sharable and discoverable manner. | An independent body supported by Digital Science | 1,120,830 files | Users can submit their data; open access deposits are free, and there are charges for private storage. | http://figshare.com |
| Harvard Dataverse Network | Free and open to all researchers worldwide to share, cite, reuse and archive research data. | Institute of Quantitative Social Science at Harvard University | 755,386 files; 886 data repositories | Researchers can upload data into the Dataverse network, free up to 1 TB. | https://thedata.harvard.edu/dvn |

KNB = Knowledge Network for Biocomplexity; TreeBASE = a repository of phylogenetic information.

One of the most important factors in the successful implementation of an RDM service is the selection of software to manage the research data effectively. Several open source options are available, including DataBank, CKAN and Dataverse. Table 3 lists these open source options along with their websites.

Table 3: Open Source Software Solutions for RDM

| Software | Created by | Software and Platform | Website |
| --- | --- | --- | --- |
| DataBank | Oxford Bodleian Libraries | Open source software, Linux platform | http://www.dataflow.ox.ac.uk/index.php/databank |
| CKAN | Open Knowledge Foundation | Open source software, Linux platform | http://ckan.org |
| Dataverse | Institute of Quantitative Social Science at Harvard University | Open source software, Linux platform | http://dataverse.org |

DataBank: DataBank is a scalable data repository designed for institutional deployment. It is an open source project that promotes free-to-use, cloud-hosted systems for the management, preservation and publication of research data sets. DataBank was created by Oxford Bodleian Libraries.

CKAN: The Comprehensive Knowledge Archive Network (CKAN) is an open source data management platform adopted by numerous governments, organizations and communities around the world. CKAN is supported by the Open Knowledge Foundation (OKFN) and is one of the most popular research data management software packages available. There are 116 CKAN instances around the world, covering all types of organizations, including government, local government, academic, community and other organizations. Most government open data sites run on CKAN. Very few academic institutions have adopted CKAN for their research data management needs.

5. Research Data Management Repository at IIMA

Indian Institute of Management Ahmedabad is one of the leading business schools in India and the world. Its library, the Vikram Sarabhai Library (VSL), serves about 100 core faculty members, 100 research students, 1,000 students and a good number of research associates working at IIM Ahmedabad. VSL also caters to researchers from around the country, mainly in the subject of business and management.

Business and management researchers need different kinds of data sets for their research, which they source from subscribed resources, open data, or primary data they collect themselves through surveys and other instruments. The library and its staff play a key facilitating role in research data collection in the context of subscribed and open sources. In addition to identifying the sources, library staff are actively involved in downloading, collecting, organising and disseminating datasets to library users as per individual needs. These datasets can include raw data, processed data or analysed data. Over time, with the increased focus on research at the Institute, large volumes of datasets are expected to be generated. The challenge will be to preserve these datasets appropriately for future access and reuse.

It is pertinent that the library and the academic research fraternity at Indian Institute of Management Ahmedabad implement a service that helps preserve data in a standardised format for long-term access and reuse. An effective research data management service may be the solution to this challenge, and this paper proposes an RDM service for the Institute.

5.1 CKAN - Software solution adopted for the RDM service at IIMA

The existing open source software options were reviewed; those considered included the Harvard Dataverse Network, DataBank and CKAN. One of the major factors in favour of CKAN was that many data repositories had adopted it, and it appeared to be popular with national governments for hosting their data repositories. In addition to its features, CKAN also has a very active user community, and this was one of the main reasons it was selected to develop the IIMA RDM service.

CKAN has all the important functions required in research data management, such as data storage, licensing, metadata, persistent URLs and authentication. The major features are:

- Integration with the institutional research environment (e.g. hooks into the CRIS system, Institutional Repository, DMP Online, networked storage)
- Capturing the research process/context/activity notation, not just data
- Controlled access for research partners
- Good, comprehensive search tools
- Version control for data and metadata
- Customisable, extensible metadata
- Adherence to data standards, e.g. RDF
- Multi-level access policies
- Secure, backed-up, scalable file storage for anywhere access to files and file sharing (e.g. Dropbox)
- Command-line tools and a good web UI for deposit/update of data
- Permanent URIs for citation, e.g. DOIs
- Import/export of common data formats
- Linking datasets (by project, type, research output, person, etc.)
- Rights/license management
- Commercial support; widely used, popular platform

One drawback is that the documentation for installation and customization is not comprehensive and could be a barrier for non-IT users.
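
As an illustration of the search feature listed above, the sketch below builds a query URL for CKAN's Action API `package_search` call, optionally restricted to one organisation. It is a minimal sketch and not the IIMA configuration: the site address `https://data.example.org` and the organisation name `library` are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, organization=None, rows=10):
    """Return a package_search URL for a CKAN site.

    base_url and organization are placeholders chosen for illustration;
    they do not describe the IIMA repository.
    """
    params = {"q": query, "rows": rows}
    if organization:
        # fq is a filter query; here it limits results to one organisation.
        params["fq"] = f"organization:{organization}"
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Hypothetical site and organisation names, for illustration only.
print(build_search_url("https://data.example.org", "inflation", organization="library"))
```

Fetching the resulting URL from a running CKAN site returns a JSON document describing the matching datasets.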

5.2 Installation and Customization

CKAN can be installed in three ways: (1) from an operating system package, (2) from source, or (3) using a Docker image. At the time of writing, the recommended operating system for CKAN is Ubuntu 12.04 (64-bit). CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. The CKAN website provides a detailed manual for each procedure, which can be followed to install CKAN for a data repository.
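
Once CKAN is installed, one simple way to confirm that the site is responding is to request the Action API's `status_show` call and read the reported version. The sketch below only builds the URL and parses a response body. The site address and the sample response are invented for illustration and are not taken from the IIMA installation.

```python
import json


def status_url(base_url):
    """Return the status_show URL for a CKAN site (base_url is a placeholder)."""
    return base_url.rstrip("/") + "/api/3/action/status_show"


def summarize_status(body):
    """Extract the CKAN version and enabled extensions from a status_show response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failure")
    result = payload["result"]
    return {
        "version": result.get("ckan_version"),
        "extensions": result.get("extensions", []),
    }


# Invented sample response, for illustration only.
sample = '{"success": true, "result": {"ckan_version": "2.3", "extensions": ["stats"]}}'
print(status_url("https://data.example.org"))
print(summarize_status(sample))
```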

5.3 Creating an IIMA Dataset Repository

A dataset is a unit or group of data; it can be, for example, demographic data of a country, financial data or economic data. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML and RDF. The data can be stored internally, or an external link to the data host can be provided. A dataset may contain multiple types of data, and each can be added separately.
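
The two parts of a dataset described above can be sketched as the JSON payload that CKAN's `package_create` API call accepts. The mapping from the form fields to API keys (for example Description to `notes`, Source to `url`, Visibility to `private`) reflects our reading of the CKAN API. The State field is omitted on the assumption that CKAN sets it itself. All values shown are hypothetical and do not describe an actual IIMA dataset.

```python
import json


def build_dataset_payload(title, slug, description, tags, license_id,
                          organization, private, source_url, author,
                          maintainer, resources):
    """Assemble dataset metadata plus its data files (resources)."""
    return {
        "title": title,                 # Title
        "name": slug,                   # URL (the dataset's URL slug)
        "notes": description,           # Description
        "tags": [{"name": tag} for tag in tags],
        "license_id": license_id,       # License
        "owner_org": organization,      # Organization
        "private": private,             # Visibility
        "url": source_url,              # Source
        "author": author,
        "maintainer": maintainer,
        # Each resource is one file or external link attached to the dataset.
        "resources": [
            {"name": name, "url": url, "format": fmt}
            for name, url, fmt in resources
        ],
    }


# Hypothetical example values, for illustration only.
payload = build_dataset_payload(
    title="Sample survey data",
    slug="sample-survey-data",
    description="Illustrative dataset with two resources.",
    tags=["survey", "sample"],
    license_id="cc-by",
    organization="library",
    private=True,
    source_url="https://example.org/source",
    author="A. Researcher",
    maintainer="Library staff",
    resources=[
        ("Raw responses", "https://example.org/raw.csv", "CSV"),
        ("Codebook", "https://example.org/codebook.pdf", "PDF"),
    ],
)
print(json.dumps(payload, indent=2))
```

Listing the CSV and PDF files as separate resources mirrors the point that a dataset may contain several types of data, each added separately.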

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users. Before adding datasets, the administrator must create 'organizations' and 'groups'. Each organisation has its own administrator, who can decide the rights and responsibilities of registered users in that organisation. Each dataset is owned by an 'organization', and each 'organization' can have its own workflow and authorization procedure. Datasets can be made private or public, which lets the organisation administrator decide whether a dataset is limited to specific organisations. The administrator of an organisation can also assign the role of editor or viewer to each registered user of that organisation.
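
The access rules described above can be illustrated with a small model. This is a simplified sketch of the idea (organisations own datasets, members hold roles, private datasets are visible only to members). It is not CKAN's actual authorization code, and the role names and user names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    name: str
    members: dict = field(default_factory=dict)  # username -> role


@dataclass
class Dataset:
    title: str
    owner: Organization
    private: bool = True


def can_view(user, dataset):
    """Public datasets are open to all; private ones only to organisation members."""
    if not dataset.private:
        return True
    return user in dataset.owner.members


def can_edit(user, dataset):
    """Only administrators and editors of the owning organisation may edit."""
    return dataset.owner.members.get(user) in {"admin", "editor"}


# Hypothetical organisation and users, for illustration only.
library = Organization("library", {"asha": "admin", "ravi": "editor", "meena": "viewer"})
survey = Dataset("Sample survey data", owner=library, private=True)

print(can_view("meena", survey), can_edit("meena", survey))
print(can_view("guest", survey), can_edit("ravi", survey))
```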

The content initially planned for the IIMA RDM service includes:

- Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw, processed or analysed data.
- Datasets compiled by the library staff from open access or subscribed sources, in a format that is useful to researchers.

The major objectives of the IIMA RDM service are to:

- Preserve, over the long term, the datasets generated by its researchers.
- Share data sets for collaborative research.
- Develop an archive of datasets already compiled in response to earlier reference queries, so that they can be reused not only by researchers but also by the reference staff in addressing future reference requests.
- Standardize various datasets and integrate them into a single platform to facilitate search and discovery, in addition to preservation.
- Help researchers avoid duplicating effort in compiling datasets.

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 data sets in 2 organisations.

Figure 2: Home Page of Data Repository

Figure 3: Data Sets

Figure 4: Details of data set

Figure 5: Data Set Preview

6. Conclusion

Research Data Management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making and planning may differ from institution to institution in the context of RDM services. However, one common critical success factor for an RDM service is the active participation of the major stakeholders, such as the director, administrators, library and IT staff, and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate whether they wish to share their data openly or restrict its use to private communities.

The sharing of data is increasingly becoming a globally accepted objective, with governments mandating open access to data and to the research they fund. The opening up of World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA, research is being emphasised in faculty recruitment and evaluation, which is expected to generate large volumes of research data. In addition, an increase in research-data-based reference services by the library has made it necessary to implement an RDM service in the library. Instead of waiting for institutional policies on sharing or securing data to be developed, the library has chosen to create a research dataset repository first and then address policy issues relating to access. The library is confident that this RDM service will be useful to its users and to the library's reference staff.

Selecting the software took considerable time; CKAN was finally identified as appropriate because it was judged the best available option in the open source domain. As the importance and implementation of such services increase among libraries, more technologies and tools are likely to become available for adoption. It is envisaged that the research community of IIMA will find the existing RDM service useful and that it will lead to more research at the Institute.


About Authors

Mr. Mallikarjun Dora, Professional Assistant, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: mallikarjun@iimahd.ernet.in

Dr. H. Anil Kumar, Librarian, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: anilkumar@iimahd.ernet.in

Page 7: Managing Research Data in Academic Institutions: Role of ...eprints.rclis.org/24911/1/50.pdf · One of the global emerging trends in academic libraries is to facilitate the management

- 490 -

10th International CABLIBER 2015 Managing Research Data in Academic Institutions

cover all types of organizations including Govern-mentLocal GovernmentAcademicCommunityand Other organizations Most of the Governmentopen data sites are run on CKAN software Thereare very few academic institutions who have adoptedCKAN for their research data management needs

5 Research Data Management Repository at IIMA

Indian Institute of Management Ahmedabad is oneof the leading business schools in India and theworld Its library Vikram Sarabhai Library (VSL)services about 100 core faculty members 100 re-search students 1000 students and a good numberresearch associates working at IIM Ahmedabad VSLalso caters to researchers from around the countrymainly in the subject of business and management

Business and management researchers need differ-ent kind of data sets for their research which theysource from subscribed resources open data or pri-mary data collected by themselves through surveysand other instruments The library and its staff playa key facilitating role in research data collection inthe context of subscribed and open sources In ad-dition to identifying the sources library staff is ac-tively involved in downloading collecting organisingand disseminating the datasets to the library usersas per individual needs These datasets can includeraw data processed data or analysed data Over aperiod of time with increased focus on research atthe Institute large volumes of datasets are expectedto be generated The challenge would be in appro-priately preserving these datasets for future accessand reuse

It is pertinent that the library and academic researchfraternity at Indian Institute of ManagementAhmedabad requires implementing a service thatwill be helpful to preserve the data in a standardised

format for long-term access and reuse An effectiveresearch data management service maybe the ulti-mate solution to this challenge and this paper pro-poses a RDM service for the Institute

51 CKAN - Software solution adopted for theRDM service at IIMA

The existing open source software options were re-viewed and the options considered for this reviewincluded Harvard Dataverse network Databank andCKAN One of the major factors in favour of CKANwas that there were many data repositories adopt-ing it and it seemed to be quite popular with variousnational governments for hosting their data reposi-tories In addition to excellent features CKAN alsohave a very active user community and this was oneof the main reasons why it was selected to developthe IIMA RDM service

CKAN had all the important function like data stor-age licensing metadata persistent URL authenti-cation etc which are very much required in researchdata management The major features are

Integration with the institutional research envi-ronment (eg hooks into CRIS system Institu-tional Repository DMP Online networked stor-age)

Capturing the research processcontextactiv-ity notation not just data

Controlled access to research partners

Good comprehensive search tools

Version control for data and metadata

Customisable extensible metadata

Adherence to data standards eg RDF

Multi-level access policies

- 491 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

Secure backed up scalable file storage foranywhere access to files and file sharing (egDropbox)

Command-line tools and good web UI fordepositupdate of data

Permanent URIs for citation eg DOIs

Importexport of common data formats

Linking datasets (by project type researchoutput person etc)

Rightslicense management

Commercial supportwidely used popularplatform (

Documentation for installation andcustomization is not comprehensive and couldbe a barrier for non-IT users

52 Installation and Customization

Installation of CKAN can be done through threeprocedures (1) Installation from an operating sys-tem package (2) install from source (3) install using aDocker image The recommended operating sys-tem for CKAN is Ubuntu 1204 64bit CKAN is writ-ten in Python use Solr for search and relational da-tabase is PostgreSQL A detailed manual has beengiven in the CKAN website for each procedure onecan use these procedures for installing CKAN forthe respective data repository

53 Creating a IIMA Dataset Repository

One may need to understand that dataset is unit ofdata or group of data and it can demographic dataof a country financial data economic data etc Eachdataset comprises of two parts first one is themetadata of dataset which contains the followingfields

Title URL Description Tags License OrganizationVisibility State Source Author maintainer

The second part of the dataset is the data whichneeds to be managed CKAN supports most of thedata formats that include Excel CSV PDF XMLRDF etc The data could physically be stored inter-nally or on an external link could be provided to thedata host It may be noted that a dataset may con-tain multiple types of data and each can be addedseparately

54 Policy and Planning

CKAN provides for one master administrator whohas full rights in designing the system There canunlimited number of users of this system Beforeadding datasets it is required that the administratorcreates lsquoorganizationsrsquo and lsquogroupsrsquo Eachorganisation has its own administrator who candecide the rights and responsibilities of registeredusers in that organisation Each dataset is owned byan lsquoorganizationrsquo Each lsquoorganizationrsquo could have itsown workflow and authorization procedureDatasets can be made private or public facilitatingorganisation administrator to decide whether adataset can be limited to specific organizations ornot Administrator of the organization can also as-sign the role of editor or viewer to each registereduser of that organisation

The content of the RDM service for IIMA that hasbeen initially planned includes

Datasets produced by IIMA researchers for theirscholarly research and project and it may in-clude raw data processed data or analysed data

Datasets generated from open access orsusbcribed sources by the library staff from dif-

- 492 -

10th International CABLIBER 2015 Managing Research Data in Academic Institutions

ferent sources and compiled in a format thatwill be useful to the researchers

The major objectives of a RDM service of IIMA arearticulated as

Long term preservation of datasets generatedby its researchers

Sharing data sets for collaborative research

Develop an archive of datasets already com-piled in response to earlier reference queries soas to facilitate reuse of the datasets not only by

the researchers but also by the reference staffin addressing future reference requests

Standardize various datasets and integrate theminto a single platform to facilitate search anddiscovery in addition to preservation

Assist the researchers in avoiding duplicationefforts in compiling datasets

Status of IIMA RDM Service

Currently the IIMA research data repository has 62data sets in 2 organisations

Figure 2 Home Page of Data Repository

- 493 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

Figure 3 Data Sets

Figure 4 Details of data set

Figure 5 Data Set Preview

- 494 -

10th International CABLIBER 2015 Managing Research Data in Academic Institutions

6 Conclusion

Research Data Management service is an emergingas an important offering by academic Infrastruc-ture requirements policy making planning may dif-fer from institution to institution in the context ofRDM services However one common critical suc-cess factor in the success of a RDM service is theactive participation of the major stakeholders likedirector administrator library IT staff and research-ers It is also important that a policy is developedand stated at the institutional level to give the rightimpetus and direction to such initiatives Within theavailable options institutions can create policies thatindicate their desire to share their data openly orrestrict the use for private communities

The sharing of data is now increasingly becoming aglobally accepted objective with governments man-dating Open Access of data and research funded byit The opening up of the World Bank data is a testi-mony to the fact that international organisations arealso working on opening up access to their data

At IIMA research is being emphasised for facultyrecruitment and evaluation leading to a situationwherein large volumes of research data would begenerated In addition to this type of data an in-crease in research data based reference services bythe library has forced the implementation of RDMservice in the library Instead of waiting to developinstitutional policies for sharing or securing the li-brary has embarked on a path to create a researchdataset repository first and then look at policy is-sues relating to access The library is sure of the useof this RDM service for its users and also to thereference staff in the library

Selecting the software CKAN did involve consider-able time and was finally identified to be appropri-ate as it was the best available in the open sourcedomain As the importance and implementation ofsuch services increase among the libraries in future

we will see that a number of technologies and toolswould be available for adoption It is envisaged thatthe research community of IIMA will find the exist-ing RDM service useful and lead to more researchat the Institute

References

1 Akers K G Sferdean F C Nicholls N H ampGreen J A (2014) Building Support for Re-search Data Management Biographies of EightResearch Universities International Journal ofDigital Curation 9(2) 171ndash191 doi102218ijdcv9i2327

2 Christensen-Dalsgaard B (2012) Ten recom-mendations for libraries to get started with re-search data management Final report of theLIBER working group on E-Science ResearchData Management Retrieved January 30 2014from httpwwwlibereuropeeusitesdefaultfilesThe20research20data20group20201220v720finalpdf

3 Corrall S Kennan M A amp Afzal W (2013)Bibliometrics and Research Data ManagementServices Emerging Trends in Library Supportfor Research Library Trends 61(3) 636ndash674doi101353lib20130005

4 Cox A M amp Pinfield S (2013) Research datamanagement and libraries Current activities andfuture priorities Journal of Librarianship andInformation Science 0961000613492542doi1011770961000613492542

5 EPSRC Policy Framework on Research DataScope and benefitsEngineering and PhysicalSciences Research Council 2011

Available from httpwwwepsrcacukaboutstandardsresearchdatascope

6 Gold A (2007) Cyberinfrastructure data andlibraries part 2 Libraries and the data challengeRoles and actions for libraries D-Lib Magazine

- 495 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

13(910) httpwwwdliborgdlibseptember07gold09gold-pt2html

7 Henty M (2008) Dreaming of data Thelibraryrsquos role in supporting e-research and datamanagement Australian Library and Informa-tion Association Biennial Conference AliceSprings Available at httpapsranueduaupresentationshenty_alia_08pdf

8 Jones S Pryor G amp Whyte A (2013) lsquoHow toDevelop Research Data Management Services- a guide for HEIsrsquo DCC How-to GuidesEdinburgh Digital Curation Centre Availableat httpwwwdccacukresourceshow-guidesh o w - d e v e l o p - r d m -servicessthashkfiImTyydpuf

9 Lewis MJ (2010) Libraries and the manage-ment of research data In McKnight S editorEnvisioning future academic library servicesLondon Facet Available at httpe p r i n t s w h i t e r o s e a c u k 1 1 1 7 1 1 LEWIS_Chapter_v10pdf

10 Pinfield S Cox A M amp Smith J (2014) Re-search Data Management and Libraries Rela-tionships Activities Drivers and InfluencesPLoS ONE 9(12) e114734 doi101371journalpone0114734

11 Pryor G (ed) (2012) Managing Research DataLondon Facet

12 Tenopir C Sandusky R J Allard S amp BirchB (2014) Research data management servicesin academic research libraries and perceptionsof librarians Library amp Information Science Re-search 36(2) 84ndash90 doi101016jlisr201311003

13 Tenopir C Sandusky R J Allard S amp BirchB (2012) Academic librarians and research dataservices preparation and attitudes IFLA Jour-nal 39(1) 70ndash78 doi1011770340035212473089

14 UK Data Service Research data life cycle Avail-able at httpukdataserviceacukmanage-datalifecycleaspx

15 Univeristy of Lincoln Orbital Project Choos-ing CKAN for research data managementAvailable at httporbitalblogslincolnacuk20120906choosing-ckan-for-research-data-management

16 Whyte A Tedds J (2011) lsquoMaking the Casefor Research Data Managementrsquo DCC BriefingPapers Edinburgh Digital Curation CentreAvailable online httpwwwdccacukre-sourcesbriefing-papers - See more at httpwwwdccacukresourcesbriefing-papersmak-ing-case-rdmsthashY9tAoGFudpuf

Website visited

1 httpckanorg

2 httpdatabiborgindexphp

3 httpdatadryadorg

4 httpfigsharecom

5 httpwwwdataflowoxacukindexphpdatabank

6 httpwwwre3dataorg

7 httpsthedataharvardedudvn

About Authors

Mr Mallikarjun Dora Professional AssistantIndian Institute of Management AhmedabadVastrapur AhmedabadEmailmallikarjuniimahdernetin

Dr H Anil Kumar Librarian Indian Institute ofManagement Ahmedabad VastrapurAhmedabadEmailanilkumariimahdernetin

Page 8: Managing Research Data in Academic Institutions: Role of ...eprints.rclis.org/24911/1/50.pdf · One of the global emerging trends in academic libraries is to facilitate the management

- 491 -

Managing Research Data in Academic Institutions 10th International CABLIBER 2015

Secure backed up scalable file storage foranywhere access to files and file sharing (egDropbox)

Command-line tools and good web UI fordepositupdate of data

Permanent URIs for citation eg DOIs

Importexport of common data formats

Linking datasets (by project type researchoutput person etc)

Rightslicense management

Commercial supportwidely used popularplatform (

Documentation for installation andcustomization is not comprehensive and couldbe a barrier for non-IT users

5.2 Installation and Customization

CKAN can be installed in three ways: (1) from an operating system package, (2) from source, or (3) using a Docker image. The recommended operating system for CKAN is Ubuntu 12.04 64-bit. CKAN is written in Python, uses Solr for search, and uses PostgreSQL as its relational database. The CKAN website provides a detailed manual for each procedure, which can be followed to install CKAN for the respective data repository.
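Whichever procedure is used, a quick way to confirm that the installation responds is to query CKAN's Action API. The sketch below is only illustrative: it builds the URL of the `status_show` action and summarizes a response. The site address, the sample response, and the version string in it are assumptions made for the example, not details of the IIMA installation.

```python
import json
from urllib.parse import urljoin


def status_url(site_url: str) -> str:
    """Build the URL of CKAN's `status_show` action for a given site."""
    base = site_url if site_url.endswith("/") else site_url + "/"
    return urljoin(base, "api/3/action/status_show")


def summarize_status(response_text: str) -> str:
    """Turn a `status_show` JSON response into a one-line summary."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful API call")
    result = payload["result"]
    title = result.get("site_title", "CKAN")
    version = result.get("ckan_version", "unknown")
    return f"{title} is running CKAN {version}"


# Hypothetical response body, shaped like the JSON envelope the Action API returns.
sample = '{"success": true, "result": {"ckan_version": "2.3", "site_title": "Demo Repository"}}'

print(status_url("http://localhost:5000"))
print(summarize_status(sample))
```

In practice the response text would come from an HTTP GET on the URL returned by `status_url`; it is hard-coded here so that the example runs without a live server.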

5.3 Creating an IIMA Dataset Repository

A dataset is a unit or group of data; it can be the demographic data of a country, financial data, economic data, etc. Each dataset comprises two parts. The first is the metadata of the dataset, which contains the following fields:

Title, URL, Description, Tags, License, Organization, Visibility, State, Source, Author, Maintainer

The second part of the dataset is the data that needs to be managed. CKAN supports most data formats, including Excel, CSV, PDF, XML, RDF, etc. The data can be stored physically within the repository, or an external link to the data host can be provided. A dataset may contain multiple types of data, and each can be added separately.
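The two-part structure described above can be sketched as a small data model. This is a minimal illustration, not the repository's actual configuration: the mapping from the field labels above to the keys accepted by CKAN's `package_create` action (for example Description to `notes`, Organization to `owner_org`, Visibility to `private`, Source to `url`) is an assumption of this sketch, and all sample values are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """One data file in a dataset: stored internally or linked externally."""
    name: str
    url: str
    format: str


@dataclass
class Dataset:
    """Part one (metadata fields) plus part two (the attached data)."""
    name: str            # URL slug of the dataset
    title: str
    description: str
    tags: List[str]
    license_id: str
    organization: str
    private: bool        # Visibility: True = private, False = public
    source: str
    author: str
    maintainer: str
    state: str = "active"
    resources: List[Resource] = field(default_factory=list)

    def to_payload(self) -> Dict:
        """Map the fields to the key names assumed for `package_create`."""
        return {
            "name": self.name,
            "title": self.title,
            "notes": self.description,
            "tags": [{"name": tag} for tag in self.tags],
            "license_id": self.license_id,
            "owner_org": self.organization,
            "private": self.private,
            "state": self.state,
            "url": self.source,
            "author": self.author,
            "maintainer": self.maintainer,
            "resources": [
                {"name": r.name, "url": r.url, "format": r.format}
                for r in self.resources
            ],
        }


# Hypothetical dataset holding two types of data, each added separately.
dataset = Dataset(
    name="example-demographic-data",
    title="Example Demographic Data",
    description="Illustrative dataset compiled for a reference query.",
    tags=["demographics", "example"],
    license_id="cc-by",
    organization="example-library",
    private=True,
    source="http://example.org/source",
    author="Example Author",
    maintainer="Example Maintainer",
)
dataset.resources.append(Resource("Population table", "http://example.org/population.csv", "CSV"))
dataset.resources.append(Resource("Codebook", "http://example.org/codebook.pdf", "PDF"))

print(json.dumps(dataset.to_payload(), indent=2))
```

Keeping the resources in a separate list mirrors the point made above that one dataset may hold several files of different formats under a single metadata record.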

5.4 Policy and Planning

CKAN provides for one master administrator who has full rights in designing the system. There can be an unlimited number of users of this system. Before adding datasets, the administrator must create 'organizations' and 'groups'. Each organization has its own administrator, who can decide the rights and responsibilities of the registered users in that organization. Each dataset is owned by an 'organization', and each 'organization' can have its own workflow and authorization procedure. Datasets can be made private or public, which lets the organization administrator decide whether a dataset is limited to specific organizations or not. The administrator of the organization can also assign the role of editor or viewer to each registered user of that organization.
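The ownership and role rules described above can be sketched as a small model. This is an illustrative simplification, not CKAN's authorization code: the class names, the role strings (taken from the wording above), the rule that any member of the owning organization may view a private dataset, and the sample users are all assumptions of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

ROLES = ("admin", "editor", "viewer")


@dataclass
class Organization:
    """An organization with its registered users and their roles."""
    name: str
    members: Dict[str, str] = field(default_factory=dict)  # user -> role

    def assign(self, user: str, role: str) -> None:
        """Give a registered user a role in this organization."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role


@dataclass
class Dataset:
    """Every dataset is owned by exactly one organization."""
    title: str
    owner: Organization
    private: bool = True


def can_view(user: Optional[str], dataset: Dataset) -> bool:
    """Public datasets are open to all; private ones only to the owner's members."""
    if not dataset.private:
        return True
    return user is not None and user in dataset.owner.members


def can_edit(user: Optional[str], dataset: Dataset) -> bool:
    """Only admins and editors of the owning organization may change a dataset."""
    if user is None:
        return False
    return dataset.owner.members.get(user) in ("admin", "editor")


# Hypothetical organization and users.
library = Organization("example-library")
library.assign("asha", "admin")
library.assign("ravi", "editor")
library.assign("meena", "viewer")

private_set = Dataset("Compiled reference data", owner=library, private=True)
public_set = Dataset("Open survey data", owner=library, private=False)

for user in ("asha", "ravi", "meena", None):
    print(user, can_view(user, private_set), can_edit(user, private_set))
```

An anonymous visitor (`None`) can see only `public_set`, while a viewer can read the private dataset but not modify it.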

The content initially planned for the IIMA RDM service includes:

Datasets produced by IIMA researchers for their scholarly research and projects, which may include raw, processed, or analysed data

Datasets generated by the library staff from different open access or subscribed sources and compiled in a format that will be useful to the researchers

The major objectives of the IIMA RDM service are articulated as:

Long-term preservation of datasets generated by its researchers

Sharing datasets for collaborative research

Developing an archive of datasets already compiled in response to earlier reference queries, so as to facilitate reuse of the datasets not only by the researchers but also by the reference staff in addressing future reference requests

Standardizing various datasets and integrating them into a single platform to facilitate search and discovery, in addition to preservation

Assisting the researchers in avoiding duplication of effort in compiling datasets

Status of IIMA RDM Service

Currently, the IIMA research data repository has 62 datasets in 2 organizations.

Figure 2: Home Page of Data Repository

Figure 3: Data Sets

Figure 4: Details of Data Set

Figure 5: Data Set Preview

6. Conclusion

Research Data Management service is emerging as an important offering by academic libraries. Infrastructure requirements, policy making, and planning may differ from institution to institution in the context of RDM services. However, one common critical factor in the success of an RDM service is the active participation of the major stakeholders, such as the director, administrator, library, IT staff, and researchers. It is also important that a policy is developed and stated at the institutional level to give the right impetus and direction to such initiatives. Within the available options, institutions can create policies that indicate their desire to share their data openly or to restrict its use to private communities.

The sharing of data is increasingly becoming a globally accepted objective, with governments mandating open access to the data and research that they fund. The opening up of the World Bank data is testimony to the fact that international organisations are also working on opening up access to their data.

At IIMA, research is being emphasised for faculty recruitment and evaluation, leading to a situation wherein large volumes of research data would be generated. In addition to this type of data, an increase in research-data-based reference services by the library has forced the implementation of an RDM service in the library. Instead of waiting to develop institutional policies for sharing or securing data, the library has embarked on a path to create a research dataset repository first and then look at policy issues relating to access. The library is sure of the usefulness of this RDM service to its users and also to the reference staff in the library.

Selecting the software took considerable time, and CKAN was finally identified as appropriate because it was the best available in the open source domain. As the importance and implementation of such services increase among libraries in future, a number of technologies and tools will become available for adoption. It is envisaged that the research community of IIMA will find the existing RDM service useful and that it will lead to more research at the Institute.

References

1. Akers, K. G., Sferdean, F. C., Nicholls, N. H., & Green, J. A. (2014). Building Support for Research Data Management: Biographies of Eight Research Universities. International Journal of Digital Curation, 9(2), 171–191. doi:10.2218/ijdc.v9i2.327

2. Christensen-Dalsgaard, B. (2012). Ten recommendations for libraries to get started with research data management. Final report of the LIBER working group on E-Science / Research Data Management. Retrieved January 30, 2014, from http://www.libereurope.eu/sites/default/files/The%20research%20data%20group%202012%20v7%20final.pdf

3. Corrall, S., Kennan, M. A., & Afzal, W. (2013). Bibliometrics and Research Data Management Services: Emerging Trends in Library Support for Research. Library Trends, 61(3), 636–674. doi:10.1353/lib.2013.0005

4. Cox, A. M., & Pinfield, S. (2013). Research data management and libraries: Current activities and future priorities. Journal of Librarianship and Information Science, 0961000613492542. doi:10.1177/0961000613492542

5. EPSRC. Policy Framework on Research Data: Scope and benefits. Engineering and Physical Sciences Research Council, 2011. Available from http://www.epsrc.ac.uk/about/standards/researchdata/scope

6. Gold, A. (2007). Cyberinfrastructure, data, and libraries, part 2: Libraries and the data challenge: Roles and actions for libraries. D-Lib Magazine, 13(9/10). http://www.dlib.org/dlib/september07/gold/09gold-pt2.html

7. Henty, M. (2008). Dreaming of data: The library's role in supporting e-research and data management. Australian Library and Information Association Biennial Conference, Alice Springs. Available at http://apsr.anu.edu.au/presentations/henty_alia_08.pdf

8. Jones, S., Pryor, G., & Whyte, A. (2013). 'How to Develop Research Data Management Services - a guide for HEIs'. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available at http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

9. Lewis, M. J. (2010). Libraries and the management of research data. In McKnight, S. (editor), Envisioning future academic library services. London: Facet. Available at http://eprints.whiterose.ac.uk/11171/1/LEWIS_Chapter_v10.pdf

10. Pinfield, S., Cox, A. M., & Smith, J. (2014). Research Data Management and Libraries: Relationships, Activities, Drivers and Influences. PLoS ONE, 9(12), e114734. doi:10.1371/journal.pone.0114734

11. Pryor, G. (ed.) (2012). Managing Research Data. London: Facet.

12. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2014). Research data management services in academic research libraries and perceptions of librarians. Library & Information Science Research, 36(2), 84–90. doi:10.1016/j.lisr.2013.11.003

13. Tenopir, C., Sandusky, R. J., Allard, S., & Birch, B. (2012). Academic librarians and research data services: preparation and attitudes. IFLA Journal, 39(1), 70–78. doi:10.1177/0340035212473089

14. UK Data Service. Research data life cycle. Available at http://ukdataservice.ac.uk/manage-data/lifecycle.aspx

15. University of Lincoln. Orbital Project: Choosing CKAN for research data management. Available at http://orbital.blogs.lincoln.ac.uk/2012/09/06/choosing-ckan-for-research-data-management/

16. Whyte, A., & Tedds, J. (2011). 'Making the Case for Research Data Management'. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm

Websites visited

1. http://ckan.org

2. http://databib.org/index.php

3. http://datadryad.org

4. http://figshare.com

5. http://www.dataflow.ox.ac.uk/index.php/databank

6. http://www.re3data.org

7. https://thedata.harvard.edu/dvn

About Authors

Mr. Mallikarjun Dora, Professional Assistant, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: mallikarjun@iimahd.ernet.in

Dr. H. Anil Kumar, Librarian, Indian Institute of Management Ahmedabad, Vastrapur, Ahmedabad. Email: anilkumar@iimahd.ernet.in
