Digital | Curation | Centre
Supporting Digital Curation to safeguard research data: adding value today and ensuring long-term access
Dr Liz Lyon,
DCC Associate Director Outreach
Director, UKOLN, University of Bath, UK
This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0
JISC Conference
March 2006
Overview
• Digital curation and the e-Research cycle
• UK Digital Curation Centre
– Development activity
– Research agenda
– Advisory services
– Outreach programme
• Chemistry exemplar projects
“maintaining and adding value to a trusted body of digital information for current and future use”
UK Digital Curation Centre
• Development activities
• Research agenda
• Delivering services
• Outreach Programme
• http://www.dcc.ac.uk/
DCC people (some of them…)
• Management & Co-ordination
– Director Chris Rusbridge (University of Edinburgh)
• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII, University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
– Led by Professor Peter Buneman (University of Edinburgh)
(Very simple) e-Research Cycle and Data Curation
Diagram labels:
• Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture
• Adding value: data linking, annotation, visualisation, simulation
• (New) knowledge extraction: data mining, modelling, analysis, synthesis
• Scholarly communications: data disclosure, publication, citation, discovery, re-use
• Data management storage & validation: description, deposit, self-archiving, preservation, certification
• Data processing
• e-Infrastructure
• Open access
• Collaboration
Data capture & integration into research workflows
• R4L (Repository for the Laboratory) project, JISC-funded: automated data capture from instrumentation and deposit of results (chemistry)
• SMART TEA: electronic laboratory notebook + annotations
Disciplinary data-centres
eBank UK Project
• Two key themes:
– Open access to datasets
– Linking research data to publications and to learning
• UKOLN, University of Southampton, University of Manchester
• e-Science application ‘Combechem’: Grid-enabled combinatorial chemistry + National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal
http://www.ukoln.ac.uk/projects/ebank-uk/
A data repository entry
Access to the underlying data: complex objects
ecrystals.chem.soton.ac.uk
Data descriptions
• Validation, publication & discovery of data models & schema
• Metadata packaging standards
– METS
– MPEG 21 DIDL
• Semantic descriptions
– Formal controlled vocabularies
– High-level and domain ontologies
– Inter-disciplinary discovery
• Informal approaches: Web 2.0 “folksonomies”
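To make the metadata packaging bullet concrete, here is a minimal sketch of a METS wrapper for a complex data object, built with Python's standard library. The file IDs, URLs and MIME types are hypothetical placeholders, not taken from eBank, and the output is a bare skeleton (file section plus structural map) that has not been validated against the full METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(files):
    """Build a skeletal METS document.

    files: list of (file_id, url, mimetype) tuples (illustrative input shape).
    """
    mets = ET.Element(f"{{{METS_NS}}}mets")
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div")
    for file_id, url, mimetype in files:
        # Each component file is listed once in the file section...
        file_el = ET.SubElement(
            file_grp, f"{{{METS_NS}}}file", {"ID": file_id, "MIMETYPE": mimetype}
        )
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
        # ...and referenced from the structural map by its ID.
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return mets


if __name__ == "__main__":
    # Hypothetical component files of one dataset.
    example = build_mets([
        ("f1", "http://example.org/data/145.cif", "chemical/x-cif"),
        ("f2", "http://example.org/data/145.hkl", "text/plain"),
    ])
    print(ET.tostring(example, encoding="unicode"))
```

The point of the package is that one identifier can stand for the whole object while each component file stays individually addressable.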
Audit & certification: trusted digital repositories
• DCC Development & Services teams
• Draft Audit Checklist for Certification, August 2005, Research Libraries Group RLG-NARA
• Pilot audits planned
– Koninklijke Bibliotheek (KB)
– British Atmospheric Data Centre (BADC)
– JISC Digital Repository projects?
– Institutional repositories?
• Revised Checklist based on feedback and pilot audit outcomes
Development: Representation Information Registry
• “DCC Approach to Digital Curation” based on the Reference Model for an Open Archival Information System (OAIS); ISO standard, 14721:
• Development of a Representation Information (RI) registry/repository (DCC-RR)
• Prototype demonstrator: based on 2 key concepts to facilitate sharing of the curation effort
– Curation persistent ID
– Descriptive “label” (structural, semantic, other metadata)
• Development of tools and interfaces for creating, using and re-using representation information
See http://dev.dcc.ac.uk for details of the wiki and email list
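The two prototype concepts (a curation persistent ID and a descriptive label) can be pictured as a simple keyed registry. The sketch below is purely illustrative: the class names, field names and the example identifier are assumptions made for this example and do not describe the actual DCC-RR design or interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """Descriptive label; field names mirror the slide's three categories."""
    structural: str = ""
    semantic: str = ""
    other: dict = field(default_factory=dict)


class RepresentationInfoRegistry:
    """Hypothetical in-memory registry keyed by a curation persistent ID."""

    def __init__(self):
        self._entries = {}

    def register(self, cpid: str, label: Label) -> None:
        # A persistent ID should keep pointing at the same entry,
        # so refuse to overwrite an existing one.
        if cpid in self._entries:
            raise KeyError(f"already registered: {cpid}")
        self._entries[cpid] = label

    def lookup(self, cpid: str) -> Label:
        return self._entries[cpid]


if __name__ == "__main__":
    registry = RepresentationInfoRegistry()
    # "cpid:example-0001" is a made-up identifier for illustration only.
    registry.register(
        "cpid:example-0001",
        Label(structural="plain-text file layout",
              semantic="crystal structure description"),
    )
    print(registry.lookup("cpid:example-0001"))
```

Sharing the curation effort then amounts to many archives looking up the same entry by ID rather than each describing the format again.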
Persistent identifiers for data citation
• Warwick Workshop research issue
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• eBank data citation policy (human and machine-actionable): http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
• Domain identifiers: e.g. International Chemical Identifier (INChI) codes
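As a small illustration of expressing a persistent identifier as an http URI, the sketch below turns a DOI string into a resolvable link using the dx.doi.org prefix shown in the eBank example above. The function name and the light validation rule are assumptions made for this example, not part of any eBank or DOI tooling.

```python
from urllib.parse import quote

# Resolver prefix as it appears in the eBank citation example.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_http_uri(doi: str) -> str:
    """Return an http URI for a DOI such as '10.1594/...'."""
    doi = doi.strip()
    # Accept the common "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Very light sanity check: DOIs start with "10." and contain a "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URI path, keeping "/".
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_http_uri("doi:10.1594/ecrystals.chem.soton.ac.uk/145"))
```

The same string then serves a human reader as a citation and a program as a link to follow, which is the sense of "human and machine-actionable".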
Discovering data:
Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10),1832-1834. DOI: 10.1039/b502828k
• Domain identifier: International Chemical Identifier (INChI) code
• Google molecule using INChI
Slide from Simon Coles
Adding value: eBank linking data to publications
Linking research to learning - embedding eBank aggregator service in a science portal for student learners
Adding value through annotation
DCC Research at the University of Edinburgh
• Scientific databases: Annotation scoping report
• AstroDAS: distributed annotation servers in astronomy
• New annotation model + prototype MONDRIAN: top-ranked demonstration at recent DB conference
Supporting the community: Services
• legal - technical guidance
• Curation Manual: 45 chapters planned
• Briefing Papers
• Case studies
DCC Case Study published: Wide Field Astronomy Unit
Supporting the community: Outreach & Services
• Workshops:
– LOCKSS, 6 April, Warwick
– Archiving e-Mail, 24 April, Newcastle
– Associates Network, 17 May, NeSC, Edinburgh
– Digital Curation Policies, June, Oxford tbc
– Data dictionary for Preservation Metadata (PREMIS), July tbc
• Information Days 2006 Nottingham, Birmingham, Manchester
• 2nd International Conference 21-22 November Glasgow
• Keynotes: Hans F. Hoffmann, CERN, Clifford Lynch, CNI
• Call for papers deadline 3rd April
Associates Network
Goals: Develop understanding, share best practice, advance research, promote recognition, develop consensus
376 members and growing…
Benefits: Early access to R&D outputs, advisory services, training, input to definition and design, community participation
Discussion Forum: www.dcc.ac.uk. Topics: formats at risk, Creative Commons, digital archives and digital libraries
Meeting: 17 May @ NeSC, Edinburgh. Please join us!
Thank you. Questions?
Join the DCC Associates Network at www.dcc.ac.uk