UKOLN is supported by:
Developing e-Infrastructure to support new research and learning paradigms.
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
Building the Info Grid, Copenhagen, September 2005.
www.bath.ac.uk
a centre of expertise in digital information management
www.ukoln.ac.uk
DEFF Seminar, Copenhagen, September 2005
Overview
1. e-Research: a changing landscape
2. Developing infrastructure: repository services & adding value
• Aggregation and linking: eBank UK
• Integration and workflows
3. Looking to the longer term: digital curation and preservation
1. e-Research: a changing landscape
Data Overload!
How do we disseminate?
EPSRC National Crystallography Service
eScience - the data deluge
Diversity of data collections
• Very large, relatively homogeneous: Large Hadron Collider (LHC) outputs from CERN
• Smaller, heterogeneous and richer collections: World Data Centre for Solar-terrestrial Physics, CCLRC
• Small-scale laboratory results: “jumping robots” project at the University of Bath
• Population survey data: UK Biobank
• Highly sensitive, personal data: patient care records
Taxonomy of data collections
• Research collections: jumping robots
• Community collections: Flybase at Indiana (with UC Berkeley)
• Reference collections: Protein Data Bank
Source: NSF Long-Lived Digital Data Collections, draft report revised May 2005
Evolution…
Experience of data-sharing
• Large scale data sharing in the life sciences: Draft Report, June 2005. Sponsored by UK research funding bodies MRC, BBSRC, NERC, JISC, Wellcome
• Outcomes & recommendations
– Importance of standards and good quality metadata
– Require a data management plan
– Work needed on vocabularies & ontologies
– Awareness of archiving & long term preservation
• Position of research funders and policy makers?
Learning & Teaching workflows
Research & e-Science workflows
Aggregator services: national, commercial
Repositories: institutional, e-prints, subject, data, learning objects
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Resource discovery, linking, embedding
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Resource discovery, linking, embedding
Deposit / self-archiving
Learning object creation, re-use
Searching, harvesting, embedding
Quality assurance bodies
Validation
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
The scholarly knowledge cycle.
Liz Lyon, Ariadne, July 2003.
This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0
© Liz Lyon (UKOLN, University of Bath), 2005
A view in 2005
• Institutional repositories: country update CNI/JISC/SURF
• D-Lib Magazine, September 2005
• Emerging trends
– Germany 103, UK 31, Sweden 25
– Policy: Germany YES, UK RCUK draft
– National programmes: UK, Germany, Australia, Sweden, Netherlands YES
– Services: indexing, search, harvesting
2. Developing infrastructure: repository services & adding value
Developing models
• The e-Framework for Education & Research
• JISC, UK and Department of Education, Science & Training, Australia
• www.e-framework.org
“The primary goal of the initiative is to produce an evolving and sustainable, open standards based service oriented technical framework to support the education and research communities.”
• Reference models
• Service definitions
JISC-funded content providers
institutional content providers
external content providers
brokers, aggregators, catalogues, indexes
institutional portals
subject portals
learning management systems
media-specific portals
end-user desktop/browser
presentation
fusion
provision
OpenURL link servers
shared infrastructure
authentication/authorisation (Athens)
institutional profiling services
terminology services
service registries
identifier services
metadata schema registries
© Andy Powell (UKOLN, University of Bath), 2005
This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0
JISC Information Environment architecture
The scholarly knowledge cycle (diagram as before), with eBank UK as the aggregator service.
eBank UK Project
• Two key themes:
– Open access to datasets
– Linking research data to publications and to learning
• JISC-funded from September 2003: now in Phase 2
• UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
• Exemplar: e-Science testbed ‘CombeChem’
– Grid-enabled combinatorial chemistry / crystallography
– National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal
• http://www.ukoln.ac.uk/projects/ebank-uk/
The “hybrid” project team
• UKOLN
– Michael Day
– Monica Duke
– Rachel Heery
– Traugott Koch
– Liz Lyon
– + Andy Powell
• Southampton
– Les Carr
– Simon Coles
– Jeremy Frey
– Chris Gutteridge
– Mike Hursthouse
– Andrew Milstead
• Manchester
– John Blunden-Ellis
Data Flow in eBank UK
Submit
Store/link
Data files
Metadata
Present
HTML
Institutional repository eCrystals
OAI-PMH
Harvest (XML)
Index and Search
Present
HTML
eBank aggregator service
Create
Deposition Interface
Local archive search interface
Service Provider interfaces e.g. Subject Portal
Deposit
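The harvest step in this data flow can be sketched in a few lines of Python. This is a minimal illustration, not the eBank UK implementation: it assumes the repository exposes the standard OAI-PMH `ListRecords` verb with the `oai_dc` metadata format, and the base URL, record identifier and title in the sample are hypothetical. The function names `list_records_url` and `parse_records` are also made up for this sketch.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL an aggregator would request to harvest all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier and titles from an OAI-PMH ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in record.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": identifier, "titles": titles})
    return records


# Hypothetical response, hand-written for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2005-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

An aggregator would fetch the URL, parse each response as above, and pass the extracted metadata to its index and search layer. A real harvester would also follow OAI-PMH resumption tokens, which this sketch omits.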
CombeChem: An EPSRC pilot project
X-Ray e-Lab
Analysis
Properties
Properties e-Lab
Simulation
Video
Diffractometer
Grid Middleware
Structures Database
Crystallography workflow
RAW DATA, DERIVED DATA, RESULTS DATA
• Initialisation: mount new sample, set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report
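The eight workflow stages form a strict sequence, which a repository deposit tool might track per sample. The sketch below is one hypothetical way to model that ordering. The `Stage` enum and the `next_stage` helper are illustrative names, not part of any eBank UK or National Crystallography Service software.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Crystallography workflow stages, in the order listed on the slide."""
    INITIALISATION = 1
    COLLECTION = 2
    PROCESSING = 3
    SOLUTION = 4
    REFINEMENT = 5
    CIF = 6
    VALIDATION = 7
    REPORT = 8


def next_stage(completed):
    """Return the next stage to run, or None when the workflow is finished.

    Raises ValueError if the completed stages are not a prefix of the
    workflow, i.e. a stage was skipped or run out of order.
    """
    all_stages = list(Stage)
    if list(completed) != all_stages[:len(completed)]:
        raise ValueError("stages must be completed in workflow order")
    if len(completed) == len(all_stages):
        return None
    return all_stages[len(completed)]


if __name__ == "__main__":
    done = [Stage.INITIALISATION, Stage.COLLECTION]
    print(next_stage(done).name)
```

Recording the completed stages alongside each dataset would let a deposit interface show which files (raw, derived or results data) already exist for a sample.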
A data repository entry
Access to the underlying data: complex objects
ecrystals.chem.soton.ac.uk
Harvesting: OAIster
Aggregating: search & discover
Linking data to publications
eBank embedded in a science portal
Ontologies for discovery in an interdisciplinary world
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition process
• Publish keywords in OAI
• Aggregators use keywords for linking with the broader literature
• Researchers use keyword ontology in search and discovery services
Persistent identifiers for data citation
• eBank use cases: depositor, author, service provider, reader, publisher, ?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Added value services: CrossRef, resolution service, integration (Globus), look-up service, ?
• Degree of trust or persistence
• Costs
• Future potential: political, ?
• Domain identifiers: International Chemical Identifier (InChI) codes
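Expressing identifiers as http URIs usually means prefixing them with a resolver address. The sketch below shows the idea for two of the listed schemes. The resolver prefixes are the public DOI and Handle proxies in use today, which is an assumption since the slide names no resolvers, and `to_http_uri` is a hypothetical helper. ARK is left out because its resolver depends on the naming authority. The example Handle is made up.

```python
# Resolver prefixes: an assumption, the slide does not name specific resolvers.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
}


def to_http_uri(scheme, identifier):
    """Express a persistent identifier as a resolvable URI."""
    scheme = scheme.lower()
    if scheme == "purl":
        # A PURL is already an http URI, so it is returned unchanged.
        return identifier
    try:
        return RESOLVERS[scheme] + identifier
    except KeyError:
        raise ValueError(f"no resolver configured for scheme {scheme!r}") from None


if __name__ == "__main__":
    # DOI taken from the exemplar data citation in the STD-DOI slide.
    print(to_http_uri("doi", "10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p"))
    # Hypothetical Handle, for illustration only.
    print(to_http_uri("handle", "1234/5678"))
```

Keeping the scheme and the bare identifier separate from the resolver prefix means the prefix can change without touching the stored identifier, which bears on the trust and persistence question raised above. Identifiers containing reserved characters would also need percent-encoding, which this sketch skips.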
Publication & citation of scientific primary data project
• National Library for Science & Technology (TIB), University of Hanover, Germany
• STD-DOI Project http://www.std-doi.de
• DOI registry for datasets
• Data requirements: quality control, long-term curation, use DOI resolver
• Data publication agents: World Data Center Climate, GeoForschungsZentrum Potsdam
• Exemplar data citation:
– Kamm, H; Machon, L; Donner, S (2004): Gas chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p
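The exemplar citation follows a regular pattern that a repository could generate from its metadata. The sketch below reproduces that pattern. The `DataCitation` class and its field names are hypothetical, and the split of the exemplar into title ("Gas chromatography (KTB Field Lab)") and publisher ("GFZ Potsdam") is an assumption read off the citation text.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Minimal metadata needed to print a dataset citation."""
    authors: list
    year: int
    title: str
    publisher: str
    doi: str

    def format(self):
        """Render 'Authors (Year): Title, Publisher. doi:DOI'."""
        names = "; ".join(self.authors)
        return f"{names} ({self.year}): {self.title}, {self.publisher}. doi:{self.doi}"


if __name__ == "__main__":
    citation = DataCitation(
        authors=["Kamm, H", "Machon, L", "Donner, S"],
        year=2004,
        title="Gas chromatography (KTB Field Lab)",
        publisher="GFZ Potsdam",
        doi="10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p",
    )
    print(citation.format())
```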
Integration into crystallographic publishing practices
Publishers' seal of approval
Integration into chemistry research workflows
• R4L Repository for the Laboratory Project (JISC-funded): automated data capture from instrumentation, registration of results
• SMART TEA: electronic laboratory notebook + annotations
• Related sub-domains of chemistry: SPECTRa Project (JISC-funded)
• Research assessment (RAE) process?
Integration into the curriculum and e-Learning workflows
• MChem course
• Assess role in Undergraduate Chemical Informatics courses
• Pedagogic evaluation
• Introducing school children to e-Research?
Knowledge extraction & “post-processing”
New information & knowledge…
• Mining (data, text, structures)
• Modelling (economic, climate, mathematical, bio)
• Analysis (statistical, lexical, pattern matching, gene)
• Presentation (visualisation, rendering)
• In federated repositories: digital libraries, datasets, learning materials
• Role of Google?
3. Looking to the longer term: digital curation & preservation
Repositories and digital curation
• Data preservation: static. For later use?
• Data curation: dynamic. In use now (and the future)?
“maintaining and adding value to a trusted body of digital information for current and future use”
Assuring long term access to the research record
• Trusted digital repositories
– Audit Checklist for Certification Draft Report
– Research Libraries Group, August 2005
– RLG-NARA Taskforce
– Defined criteria under 4 categories
• Organisation
• Functions, processes & procedures
• Designated community & usability
• Technologies & technical infrastructure
• UK Digital Curation Centre http://www.dcc.ac.uk
– 1st International DCC Conference, Sep 29-30, Bath UK
Thank you. Questions?
More information: UKOLN http://www.ukoln.ac.uk/