EuroSakai CLIF project presentation

DESCRIPTION

A presentation given at the EuroSakai 2011 conference in Amsterdam on 27th September 2011. It covers the work of the CLIF project to investigate the management of the digital lifecycle across systems, using the integration of the Sakai collaboration and learning environment with the Fedora digital repository system as an exemplar.

TRANSCRIPT

Page 1: EuroSakai CLIF project presentation

Enabling the digital content lifecycle: content flow between Sakai and Fedora

Chris Awre

Library and Learning Innovation

EuroSakai

Amsterdam, 27th September 2011

Page 2: EuroSakai CLIF project presentation

CLIF Project

CLIF - Content Lifecycle Integration Framework

Funded by JISC

01 July 2009 – 31 March 2011

Project partners

University of Hull

King’s College London

Centre for e-Research (CeRch)

Page 3: EuroSakai CLIF project presentation

Background

• CLIF is building on work within the JISC-funded RepoMMan and REMAP projects

  • In particular, REMAP explored how a repository could support records management and digital preservation as part of a lifecycle management approach for digital content

• Previous work had sought to push the repository upstream in the workflow

  • Dilemma was that the repository risked becoming another content silo alongside other content management systems on campus (in our case, Sakai and SharePoint)

• How can the repository become more integrated in the institutional environment?

Page 4: EuroSakai CLIF project presentation

Fedora

• Powerful digital repository framework

  • Adopted at University of Hull in 2005

  • Live institutional repository since 2008

• Developed and managed through DuraSpace

  • Strong community model, akin to Sakai

• Features we like (the advert!)

  • Powerful digital object model

  • Extensible metadata management

  • Expressive inter-object relationships

  • Version management

  • Configurable security architecture

Page 5: EuroSakai CLIF project presentation

Local repository need

• Scalable solution (not one that has an upper limit)

  • Digital content is only going to grow

• Standards-based (open standards where possible)

  • To provide a future-proof exit strategy

• Content agnosticism

  • We don’t know what types of content may come along

• Content semantics

  • Recording the relationships between different pieces of content supports future use and preservation

Page 6: EuroSakai CLIF project presentation

Other repository systems?

• The work focused on systems that were already in place at Hull

  • Other repository options were not actively considered

• Following on from work looking at integration of DSpace and Sakai through the CTREP project

  • Aimed to achieve the same end goal of seamless integration for Fedora

• Regardless of the system, it is important to understand what you are trying to achieve in the management of content through integration

  • Repository choice driven by external factors of how repository management is carried out

Page 7: EuroSakai CLIF project presentation

CTREP

• CTREP was a JISC-funded project, 2007-9

• Aimed to increase repository usage through integration within the LMS, using Sakai as the platform

• Cambridge examined integration with DSpace

• University of Highlands & Islands (UHI) examined integration with Fedora

• Work focused on use of the Sakai ContentHostingHandler (CHH)

• DSpace work successful, albeit that information being sent between the two was limited

• Fedora work halted as it became clear that the version of Sakai CHH at the time was not able to deal with rich Fedora objects

• Re-visiting this has been possible through Sakai developments

• We are grateful to CTREP for pioneering this approach

Page 8: EuroSakai CLIF project presentation

Lifecycle

Lifecycle management within a repository

Can this be enabled across systems?

Page 9: EuroSakai CLIF project presentation

Lifecycle integration

Content flows between systems according to need in lifecycle

[Diagram labels: Sakai, SharePoint, Repository]

Page 10: EuroSakai CLIF project presentation

Sakai and content management

• Content management for teaching & learning makes heavy use of the Resources tool

  • Content from here is used by other tools within the system in some imaginative ways

  • Content is also shared between sites, and staff are encouraged to make their content shareable

• Focus of content management is to support use within Sakai

  • Focus is on Sakai, not the content

  • A content silo?

• How could integration with a content store – a repository – enhance how Sakai manages and uses content?

Page 11: EuroSakai CLIF project presentation

CLIF project objectives

• Understand how digital content can be managed across systems as part of the digital content lifecycle

  • Recognising that individual systems cannot always support the whole lifecycle from creation to preservation or deletion

• Specifically investigate the role of repositories in the digital content lifecycle

  • Where is the repository best positioned within the lifecycle?

  • What roles can digital repositories play?

• Understand how content will flow in and out of a repository as part of the lifecycle

  • CLIF has been agnostic about this

Page 12: EuroSakai CLIF project presentation

CLIF use cases I

• Use cases cover research, teaching and administration

• Based on interviews with staff at partner institutions

  • Academic staff (Head of Department / Senior Lecturer)

  • Records Manager

  • Research active staff

• Interviews highlighted that staff were managing as best they could within single systems they were familiar with

  • Potential to exploit additional functionality in other systems welcomed

Page 13: EuroSakai CLIF project presentation

CLIF use cases II

• Research

  • Capturing data produced through experimental equipment and archiving this in the repository for use in future work

  • Preparation of research outputs and archiving of these for dissemination

• Teaching

  • Teaching materials accessed from within a repository to inform current courses

  • Exam papers created in one system and archived for future reference in the repository (marks could be archived for private access as well)

• Administration

  • Committee papers circulated to committee members before a meeting are moved to the repository for wider access post-meeting

Page 14: EuroSakai CLIF project presentation

CLIF outputs

• Literature review on managing the digital content lifecycle across systems

• Technology integrations as exemplars of how a repository can support lifecycle management across systems

  • Fedora – Sakai integration

  • Fedora – SharePoint integration

  • Software available on GitHub

  • Technical appendix to final report describing architecture and implementation

Page 15: EuroSakai CLIF project presentation

A digital content lifecycle

© Digital Curation Centre

There are many variations and versions of lifecycle models - another is not required

Each has a number of stages

CLIF sought to capture use cases that encompassed a number of these stages and tested how they could be managed across systems

Page 16: EuroSakai CLIF project presentation

Literature review

• There was little literature directly addressing the system aspects of managing the digital content lifecycle

  • Work was focused within a system or was more architecture-based without addressing specific systems

  • Possibly due to flux in technology development

• Terminology is key to addressing lifecycle management

  • There are many different lifecycles (knowledge, digitisation, metadata, etc.) that may overlap

  • Can be easier to break down the lifecycle into stages, many of which are common

Page 17: EuroSakai CLIF project presentation

Lifecycle characteristics

• The use of standards can greatly ease movement between systems

  • cf. the use of the Hydra digital object approach

• Policy is as important as technology in determining how different systems are used to manage a lifecycle

• Digital preservation can be greatly supported if considered at the beginning of the lifecycle (as REMAP found)

• There is a need to identify how people and roles fit into an overall lifecycle

• It may be valuable to record information about the lifecycle itself as content moves, but this has resource implications

  • cf. the use of PREMIS events metadata recording what happens to an object
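
As an illustration of recording information about the lifecycle as content moves, the sketch below keeps a simple event log per object. It is only loosely modelled on the idea of PREMIS events: the class names, field names and event types are assumptions made for this example, not the PREMIS schema and not the CLIF implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LifecycleEvent:
    """One thing that happened to an object (simplified, hypothetical fields)."""
    event_type: str   # e.g. "creation", "ingestion", "deletion"
    object_id: str    # identifier of the object the event applies to
    agent: str        # person or system responsible for the event
    detail: str = ""
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class EventLog:
    """Append-only list of events, queried by object identifier."""
    events: list = field(default_factory=list)

    def record(self, event_type, object_id, agent, detail=""):
        event = LifecycleEvent(event_type, object_id, agent, detail)
        self.events.append(event)
        return event

    def history(self, object_id):
        # Events are returned in the order they were recorded.
        return [e for e in self.events if e.object_id == object_id]


log = EventLog()
log.record("creation", "demo:1", "sakai", "uploaded to Resources")
log.record("ingestion", "demo:1", "fedora", "deposited into repository")
log.record("creation", "demo:2", "sakai")
```

Every extra event is another record to store and manage, which is one way of seeing the resource implications mentioned above.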

Page 18: EuroSakai CLIF project presentation

Sakai – Fedora integration

• Sakai 2.6.1

• Fedora v3.4

• Extends and enhances the JISC CTREP Fedora ContentHostingHandler plugin

  • CHH is a pluggable provider model for hosting content

  • Content displayed in standard Sakai Resources Tool

• Enabled and configured by uploading a mountpoint.properties text file

• Resources Tree view shows a ‘live view’ of a specific Fedora collection

• ‘Show other sites’ allows files and/or nested folders to be copied/moved between MyWorkspace site and Fedora mounted site

Page 19: EuroSakai CLIF project presentation

.properties configuration file

Page 20: EuroSakai CLIF project presentation

Sakai to Fedora

Page 21: EuroSakai CLIF project presentation

Or…

ContentHostingHandlerImplFedora

ContentHostingHandlerResolverImpl

DBContentService

BaseContentService

CHS API

Resources Tool
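
One way to read the stack above is as a pluggable provider model: a resolver sits between the content service and the storage back ends and hands each request to whichever handler is mounted at that path. The sketch below illustrates that general idea in Python. It is an assumption-laden illustration, not the Sakai code: `DictHandler`, `Resolver`, the mount path and the longest-prefix rule are all hypothetical.

```python
class DictHandler:
    """Hypothetical content handler backed by an in-memory dict."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)

    def get(self, relative_path):
        return self.files[relative_path]


class Resolver:
    """Routes a path to the handler mounted at its longest matching prefix."""

    def __init__(self, default_handler):
        self.default_handler = default_handler
        self.mounts = {}

    def mount(self, prefix, handler):
        # Normalise so that every mount prefix ends with exactly one slash.
        self.mounts[prefix.rstrip("/") + "/"] = handler

    def resolve(self, path):
        best = None
        for prefix in self.mounts:
            if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            # No mount point matches: fall back to local storage.
            return self.default_handler, path
        return self.mounts[best], path[len(best):]

    def get(self, path):
        handler, relative = self.resolve(path)
        return handler.get(relative)


local = DictHandler("local", {"/site/notes.txt": b"local notes"})
fedora = DictHandler("fedora", {"paper.pdf": b"repository copy"})
resolver = Resolver(local)
resolver.mount("/site/repository", fedora)
```

The caller only ever uses `resolver.get(path)`, so the tool on top does not need to know which back end served the content.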

Page 22: EuroSakai CLIF project presentation

Linking Sakai and Fedora

• Content is held very differently in Sakai and Fedora

  • Sakai holds files

  • Fedora holds objects made up of a collection of datastreams, one of which is the file (others will contain metadata)

• In linking Sakai and Fedora, three considerations need to be addressed

  • Displaying Fedora objects in a tree structure and Fedora collections as folders

    • Issue for security around the objects

  • Depositing a file in Fedora from Sakai requires a Fedora object with associated metadata to be created

  • Retrieving a file from Fedora for use in Sakai requires use of the search capability within Fedora
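
The difference between a file and an object made up of datastreams can be sketched as follows. This is a minimal illustration only: the class names, the datastream identifiers `content` and `metadata`, and the key=value metadata format are assumptions for the example and do not reflect Fedora's API or the CLIF code.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    ds_id: str
    mime_type: str
    content: bytes


@dataclass
class RepositoryObject:
    """An object is a container of named datastreams."""
    pid: str
    datastreams: dict = field(default_factory=dict)

    def add(self, ds):
        self.datastreams[ds.ds_id] = ds


def wrap_file(pid, filename, data, mime_type):
    """Deposit: build an object around a plain file plus minimal metadata."""
    obj = RepositoryObject(pid)
    obj.add(Datastream("content", mime_type, data))
    metadata = f"title={filename}\nsize={len(data)}".encode("utf-8")
    obj.add(Datastream("metadata", "text/plain", metadata))
    return obj


def extract_file(obj):
    """Retrieval: hand back only the file, which is all a file store expects."""
    return obj.datastreams["content"].content


obj = wrap_file("demo:1", "lecture1.pdf", b"%PDF-", "application/pdf")
```

Note that `extract_file` discards the metadata datastream, which mirrors the point that a file-oriented system sees only one part of the object.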

Page 23: EuroSakai CLIF project presentation

Lessons learned

• SOAP messaging between the two systems made the link very slow

  • Due to use of HTTPS

  • Switching to HTTP improved performance and allowed easier debugging

• Other performance improvements enabled included:

  • Caching of resources and folder objects

  • Minimising web service calls by using one call to retrieve multiple properties

  • No pre-fetching of datastreams

• The CHH code is over-complicated at times

  • Impact of changes at high level can be extensive lower down
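
Two of the improvements listed above, caching and fetching several properties in one call, can be illustrated together. The sketch below is hypothetical: `CountingBackend` stands in for a remote web service and simply counts round trips, and none of the names come from the CLIF code.

```python
class CountingBackend:
    """Hypothetical remote service; counts calls so the saving is visible."""

    def __init__(self, records):
        self.records = records
        self.calls = 0

    def get_properties(self, object_id):
        # One round trip returns every property of the object at once.
        self.calls += 1
        return dict(self.records[object_id])


class CachedClient:
    """Fetches all properties on first access and serves later reads locally."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def get_property(self, object_id, name):
        if object_id not in self.cache:
            self.cache[object_id] = self.backend.get_properties(object_id)
        return self.cache[object_id][name]

    def invalidate(self, object_id):
        # Call after the object changes so stale values are not served.
        self.cache.pop(object_id, None)


backend = CountingBackend(
    {"demo:1": {"label": "Lecture 1", "size": 2048, "modified": "2011-09-27"}}
)
client = CachedClient(backend)
label = client.get_property("demo:1", "label")
size = client.get_property("demo:1", "size")
```

Both reads above cost a single backend call. The trade-off is that cached values can go stale, so anything that edits an object needs to invalidate its entry.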

Page 24: EuroSakai CLIF project presentation

Sakai – Fedora features

• The repository is embedded as a set of resources that appear like any other set of resources

  • The majority of menu functions work in the same manner as with standard resources, e.g., upload, copy, paste, move, delete, create

• This applies to folders as well as individual objects

  • Folders represent collection objects in the repository

• Metadata can be captured in Sakai for use in Fedora (though Sakai is not able to re-use this when retrieving an object from Fedora)

• User can browse Fedora collection (though not yet search)

• User does not need to know they are working with the repository

Page 25: EuroSakai CLIF project presentation

Fedora 2

• Very flexible – this has made exchanging objects between Fedora instances and between Fedora and other systems difficult

• Common approach to structuring digital objects is required

  • Systems interacting with Fedora can build objects using this common approach

• CLIF adopted the approach developed through the Hydra project

  • http://projecthydra.org/

Page 26: EuroSakai CLIF project presentation

Fedora 2 contd.

• Common structuring/modelling approach allows for object metadata to be edited in the repository as part of their lifecycle management

• Each object has:

  • rightsmetadata

• …and could have…

  • descmetadata (using MODS)

  • contentmetadata

  • techmetadata

  • etc.

• If Sakai can provide this
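
A system building objects to this common structure could check which of the datastreams listed above are present before depositing. The sketch below is an illustration only: the required and optional sets are taken from the bullet list on this slide, the function name is hypothetical, and the comparison is case-insensitive because the exact casing of the identifiers is not given here.

```python
REQUIRED = {"rightsmetadata"}
OPTIONAL = {"descmetadata", "contentmetadata", "techmetadata"}


def check_object(datastream_ids):
    """Report which expected datastreams an object has or lacks."""
    present = {ds.lower() for ds in datastream_ids}
    return {
        "missing_required": sorted(REQUIRED - present),
        "optional_present": sorted(OPTIONAL & present),
        "other": sorted(present - REQUIRED - OPTIONAL),
    }


report = check_object(["rightsMetadata", "descMetadata", "content"])
incomplete = check_object(["content"])
```

An object with a non-empty `missing_required` list would not follow the common structure and could be rejected or repaired before it is moved between systems.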

Page 27: EuroSakai CLIF project presentation

Copy/move to/from Repository

Copying and moving folders/files between Fedora and MyWorkspace is easy! Copy…

Page 28: EuroSakai CLIF project presentation

Copy/move to/from Repository

…paste!

Page 29: EuroSakai CLIF project presentation

It looks easy, but…

… you don’t see what is going on underneath!

© 2008 Richard Green

Page 30: EuroSakai CLIF project presentation

Outstanding work

• Managing versions from within Sakai, or accessing them, isn’t currently possible

• Some of the commands under the Edit functionality have no current effect on the object in Fedora

• The metadata captured is minimal, and Sakai cannot make use of metadata added within Fedora

• Folders with large numbers of resources have a noticeable impact on performance when browsing or carrying out actions upon them

Page 31: EuroSakai CLIF project presentation

Evaluation

• There needs to be a clear understanding and view about where the boundaries are between the different systems being used, to avoid confusion

• There needs to be clarity over why different systems are being used, to overcome concerns about having to work with multiple systems

• There is a need for better preservation and a recognition that integrating the repository could support this, but also a need to be clear about what needs preserving

• There is benefit in being able to access other content stores from within your current working environment in order to see what is available more broadly

Page 32: EuroSakai CLIF project presentation

Sakai-repository evaluation

• The seamless access was much valued

• Having access to resources that could be used within Sakai was a valuable addition to being able to browse resources inside Sakai

• Providing access to resources in context was considered very important; hence, linking to the files in the repository instead of copying them across may be preferred

  • Why create a copy if access is OK where the content is?

• Reference or irregular content was considered to fit best into the model of access via repository

• Bulk movement likely to be more useful than object-by-object movement

Page 33: EuroSakai CLIF project presentation

Sakai OAE

• Focus on presentation of content in context

  • This tallies with findings in CLIF

• Focus on use of APIs where available

  • Institutional repository systems are not so good at this

  • A challenge for these systems

• Capturing annotations alongside original content would enhance archival records

• Exporting multiple resources, as IMS CP or other, also a route for managing content across systems

Page 34: EuroSakai CLIF project presentation

Conclusions

• Diverse content management systems can be effectively integrated to allow cross-system lifecycle management

  • Better adoption of interface standards would be helpful

• Standardisation in the structure of the content being moved maximises how the content can be managed by the different systems

• Where the repository is one of the systems involved, its current primary role appears to be as a recipient of content (for preservation)

  • Perception that content in the repository can be used there without moving it into the other integrated systems

Page 35: EuroSakai CLIF project presentation

Demo

Copyright © copyright-free-photos.org.uk

Page 36: EuroSakai CLIF project presentation

Thank you

Chris Awre – [email protected]

Richard Green – [email protected]

Andrew Thompson – [email protected]

Simon Waddington – [email protected]

Project website - http://www2.hull.ac.uk/discover/clif.aspx

Project GitHub - https://github.com/uohull/clif-sharepoint and https://github.com/uohull/clif-sakai

Project final report - http://edocs.hull.ac.uk/splash.jsp?parentId=hull:1647%26pid=hull:4194
