David Giaretta, Associate Director (Development)
for Chris Rusbridge (Director)
Funders:
Digital Curation Centre
a centre of expertise in data curation and preservation
Development
• DCC: Why?
• DCC: What?
• DCC: Where?
• DCC: Who?
• DCC: How and When? – Progress
• How can we help – each other?
Time for Digital Curation
• problem of the moment
– fragility of digital information recognised
– data curation & data deluge in e-science/research
– longevity of digital heritage & research investment
• re-examining ‘Communication’ in ICT
– Internet and GRID: communication across space, with utmost accuracy
– Digital Curation: communication across time, with utmost accuracy
• ensure Content travels despite turbulence of IT
– agree strategies & methods for digital preservation
Unifying Themes for the DCC
• ‘data as evidence’
– for understanding and decision
– for one or more designated communities
• ‘archival responsibility’
– at one or more institutional levels
– institutional policies & individuals’ competence
– legal compliance & agreement on procedures
• turn ‘open access’ into ‘continuing access’
• turn costs into investment
– valuing flow of benefit from re-usable assets
Digital Curation
• preservation and use/interoperability
– preservation is interoperability with the future
• bits and information
• libraries and science data
• digital information rendered for human reading and automated processing
– may be transient distinction but…
Aims & Objectives for the DCC
‘quality improvement in data curation & digital preservation’
initial focus: data as evidence for scholarly conclusions
wider remit: scholarly communication & eLearning
• ‘excellence in research & excellence in service’
• working with repositories, rather than being one
• ‘connecting communities’ via Associates Network
– universities & research institutes
– scientific data tradition & document tradition
– international & cross-sectoral
Organisation to Engage & Collaborate
[Diagram: DCC partner sites and the groups they engage with]
• Partner sites: UKOLN; U of Edinburgh; CCLRC; U of Glasgow; U of Edinburgh
• Engaging with: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, eg DPC
• Collaborative Associates Network of Data Organisations
Organisation to Engage & Collaborate
[Diagram: DCC activities and the groups they engage with]
• Activities: community support & outreach; research; development co-ordination; service definition & delivery; management & admin support
• Engaging with: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, eg DPC
• Collaborative Associates Network of Data Organisations
Organisation to Succeed
Phase One leadership over first eight months of funding
• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII [ERPANET], University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
– Led by Professor Peter Buneman (Informatics, University of Edinburgh)
• Management & Co-ordination
– Director Chris Rusbridge
– Peter Burnhill had been Phase One Director
‘Ex Portfolio’: Malcolm Atkinson (NeSC)
[Diagram: DCC partners and associated organisations]
• Category labels: Research Councils; HEIs & FE; Research Institutes; International Collaborations; Standards Bodies
• Organisations shown: CCLRC; UKOLN; UofG; UofE; CMS-Bristol; NIEeS; RG; Durham; WT-CFG; Leicester; IC; Maastricht; Oxford; Dutch NA; Swiss NA; Urbino; UNC; Salzburg; SDSC; NEODC; CEH; RI; NCS; RLG; Innogen; NHS; Capri; NTUA; INRIA; HUJ; UPC; Max-Planck; MIMAS; IASSIST; LDC; ACM; Data Archive; EDG; GridPP; EGEE; Cambridge; Jodrell Bank; DLI (US); DPC; DELOS; ESA; NASA; NARA; CNES; BNSC; TU Vienna; UPenn; EBI; MRC HGU; Kyoto; USC; GSK; Roslin; IBM Almaden; JHU; CSIRO; Caltech; CDS; ESO; OCLC; AHDS; Microsoft; IBM; Oracle; BT; STK; BADC; BODC; IVOA; ILRT; Council for Museums, Archives & Libraries; RDN; So’ton; OAI; NOF; NLA; NeSC
Outreach
• User interviews and focus groups
• Internet Journal
• Web presence (http://www.dcc.ac.uk) and Portal
• DPC membership and collaboration
• Associates Network
• DCC Conference (Sept 29-30)
• PV2005 Conference (Nov 21-23)
Engage Communities of Practice
• with those who have responsibility
• … to invoke/provoke good practices
– appraisal & retention/disposal
– logical & physical integrity: authenticity/security
• place research in productive research domains
– eg Informatics, Law School, e-Science ...
• work on the ‘R&D’, create services of relevance
– achieve ‘virtuous circle’
– turn products of research into tools for use
Services
• Advisory service and Help desk
• Site visits and case studies
• Curation Manual and Briefings
• Tools and testbeds
• Standards watch
• Certification
• Training
Development
• OAIS fundamentals
• Registries/Repositories for Representation Information
– offering a repository of tools and technical information, a focal point for digital curators
– metadata standards
• Testbeds
– for testing and evaluating tools, methods, standards and policies in realistic settings
• Certification
– standards
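As an illustration of the registry idea in the list above, here is a minimal sketch of a registry that maps a format identifier to its Representation Information and follows dependencies between entries. The class names, the fields, and the two example entries are hypothetical; this is not the DCC's design, only one way such a lookup might be modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepresentationInfo:
    """What a future user would need in order to interpret some bits."""
    format_id: str
    description: str
    tools: tuple = ()        # names of software able to render the format
    depends_on: tuple = ()   # other format_ids this entry relies on


@dataclass
class Registry:
    """Hypothetical in-memory registry keyed by format identifier."""
    entries: dict = field(default_factory=dict)

    def register(self, info):
        self.entries[info.format_id] = info

    def resolve(self, format_id):
        """Return the entry and everything it depends on, dependencies first."""
        seen, ordered = set(), []

        def visit(fid):
            if fid in seen:          # guards against cycles and repeats
                return
            seen.add(fid)
            info = self.entries[fid]  # raises KeyError for unknown formats
            for dep in info.depends_on:
                visit(dep)
            ordered.append(info)

        visit(format_id)
        return ordered


# Example entries, invented for illustration only.
registry = Registry()
registry.register(RepresentationInfo("ascii", "7-bit character encoding"))
registry.register(
    RepresentationInfo(
        "example-image",
        "Hypothetical image format with an ASCII header",
        tools=("example-viewer",),
        depends_on=("ascii",),
    )
)
chain = [info.format_id for info in registry.resolve("example-image")]
```

Resolving an entry returns the whole chain needed to interpret it, which is why the lookup walks `depends_on` rather than returning a single record.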
OAIS Reference Model – Functional Model
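[Diagram: OAIS functional model]
• Functional entities: Ingest; Data Management; Archival Storage; Access; Administration; Preservation Planning
• External roles: PRODUCER; CONSUMER; MANAGEMENT
• Flows: SIP from PRODUCER to Ingest; AIP from Ingest to Archival Storage and from Archival Storage to Access; Descriptive Info from Ingest to Data Management and from Data Management to Access; queries and orders from CONSUMER to Access; result sets and DIP from Access to CONSUMER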
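The path from PRODUCER to CONSUMER in the functional model can be sketched in a few lines of code. The following is a toy illustration only: the class and method names are hypothetical, and the SHA-256 checksum and the identifier scheme are assumptions added for the example, not part of the model as shown on the slide.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class SIP:
    """Submission Information Package, as sent by a Producer."""
    content: bytes
    description: str


@dataclass(frozen=True)
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    aip_id: str
    content: bytes
    description: str
    sha256: str


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    aip_id: str
    content: bytes


class Archive:
    def __init__(self):
        self.storage = {}    # Archival Storage: aip_id -> AIP
        self.catalogue = {}  # Data Management: aip_id -> Descriptive Info

    def ingest(self, sip):
        """Ingest: turn a SIP into an AIP and record its Descriptive Info."""
        digest = hashlib.sha256(sip.content).hexdigest()
        aip_id = digest[:12]  # illustrative identifier scheme (assumption)
        self.storage[aip_id] = AIP(aip_id, sip.content, sip.description, digest)
        self.catalogue[aip_id] = sip.description
        return aip_id

    def query(self, term):
        """Access: answer a Consumer query with a result set of AIP ids."""
        term = term.lower()
        return [aid for aid, desc in self.catalogue.items() if term in desc.lower()]

    def order(self, aip_id):
        """Access: fulfil an order by building a DIP from the stored AIP."""
        aip = self.storage[aip_id]
        if hashlib.sha256(aip.content).hexdigest() != aip.sha256:
            raise ValueError("stored content no longer matches its checksum")
        return DIP(aip.aip_id, aip.content)


archive = Archive()
aip_id = archive.ingest(SIP(b"1,2,3\n", "Example sensor readings"))
hits = archive.query("sensor")
dip = archive.order(hits[0])
```

Administration and Preservation Planning are left out of the sketch; they act on the archive as a whole rather than on a single package.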
Knowledge Based Persistent Archive
[Diagram: levels (Knowledge, Information, Data) against Ingest Services, Management and Access Services]
• Knowledge: Relationships Between Concepts; Knowledge Repository for Rules; Knowledge or Topic-Based Query / Browse
• Information: Attributes, Semantics; Information Repository; Attribute-based Query
• Data: Fields, Containers, Folders; Storage (Replicas, Persistent IDs); Feature-based Query
• Between levels: (Topic Maps / Buckets / Model-based Access); (Data Handling System - SRB / FTP / HTTP)
• Interface labels: MCAT/HDF; Grids; XML DTD; SDLIP; XTM DTD; Rules - KQL
• Other labels: Process; Infrastructure
Working with Others
• Digital Library Federation
• The National Archives
• Global Grid Forum
• NARA
• Library of Congress
• Research Libraries Group
• Digital Preservation Coalition
• JISC community
• E-Science Community
• Associates Network
• …and many many more
Development info: see http://dev.dcc.rl.ac.uk for details of Wiki and email list open to all
Research
• To draw together the various functions of curation, from the traditional archival functions to the maintenance and publication of evolving knowledge as seen in scientific databases.
• To identify through direct research collaboration, and through interaction with the service arm of DCC, the key projects in which research is needed.
• To conduct research in areas already identified by the partners as crucial to digital curation.
• To institute two-way conduits between research and service in which practical issues can be drawn to the attention of researchers and the products of research can be tested in practice.
Current research priorities
• Data integration and publication
• Performance and optimisation
• Annotation
• Appraisal and long-term preservation
• Socio-economic and legal context: rights, responsibilities and viability
• Cost-benefit analysis of the data curation process
• Security: safe and effective data analysis environments
• Automation of metadata extraction
• Visitors Programme and Seminar Series
How can we help - each other?
• No one knows how to “do” curation properly
• There is an overlap between DCC and other JISC projects
• We can help each other