Keith G Jeffery
Director, IT
Research Workflow Process on the
GRIDs Surface
Keith G Jeffery President euroCRIS
© Keith G Jeffery
Director, IT2Research Process Workflow GRIDs
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange Standards
• Workflow on the GRIDs surface
• Conclusion
Nirvana
Commonly used to indicate an optimal state of a person (professional) or system (suitable)
• Buddhism. “The ineffable ultimate in which one has attained disinterested wisdom and compassion”
• Hinduism. “Emancipation from ignorance and the extinction of all attachment”
In a euroCRIS context, best possible CRIS system(s) for end-users backed by best advice
Nirvana - Retrieval
• An environment where an end-user can:
– Request information and through an intelligent dialogue generate a ‘job’ which provides it
• Example (Medical R&D planning)
– How many researchers
• expert in GlycoProtein gp120 and CD4 molecule
– are likely to be available in 2015;
– Classify researchers by country, institution;
• order list of researchers by number of refereed publications to date
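The retrieval example above can be sketched as a small 'job' over in-memory records. This is only an illustration under stated assumptions: the `Researcher` fields, the `plan_query` function and the sample data are hypothetical and not taken from any real CRIS.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    # All fields are hypothetical; a real CRIS would hold these in its own schema.
    name: str
    country: str
    institution: str
    expertise: frozenset
    refereed_publications: int
    available_until: int  # last year the researcher is expected to be available


def plan_query(researchers, topics, year):
    """Select experts on all topics who are available in `year`, ranked and grouped."""
    matches = [
        r for r in researchers
        if topics <= r.expertise and r.available_until >= year
    ]
    # Order by number of refereed publications, most first.
    matches.sort(key=lambda r: r.refereed_publications, reverse=True)
    # Classify by (country, institution); ranking order is kept inside each group.
    grouped = defaultdict(list)
    for r in matches:
        grouped[(r.country, r.institution)].append(r.name)
    return len(matches), dict(grouped)


# Made-up sample records for illustration only.
SAMPLE = [
    Researcher("A. Example", "UK", "Inst-1", frozenset({"gp120", "CD4"}), 12, 2016),
    Researcher("B. Example", "FR", "Inst-2", frozenset({"gp120", "CD4"}), 30, 2015),
    Researcher("C. Example", "UK", "Inst-1", frozenset({"gp120"}), 50, 2020),
    Researcher("D. Example", "UK", "Inst-1", frozenset({"gp120", "CD4"}), 20, 2014),
    Researcher("E. Example", "UK", "Inst-1", frozenset({"gp120", "CD4", "HIV"}), 25, 2018),
]

count, by_group = plan_query(SAMPLE, frozenset({"gp120", "CD4"}), 2015)
```

In the slide's vision the end-user would never write such code; the intelligent dialogue would generate an equivalent job from the request.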
Nirvana – input / update
• An environment where an end-user can:
– Input / update information and through an intelligent dialogue obtain assistance where needed and validation of the input
• Example:
– if value input for ‘person’ then possible valid values for ‘organisational unit’ suggested
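A minimal sketch of the assisted-input example, assuming the CRIS already records which organisational units each person belongs to. The `AFFILIATIONS` table and both function names are hypothetical.

```python
# Hypothetical affiliation table: person -> organisational units already on record.
AFFILIATIONS = {
    "J. Smith": ["Physics Department", "Data Science Institute"],
    "M. Rossi": ["Chemistry Department"],
}


def suggest_org_units(person):
    """Return the organisational units to offer once 'person' has been entered."""
    return sorted(AFFILIATIONS.get(person, []))


def validate_entry(person, org_unit):
    """Accept the pair if it matches the record; otherwise return suggestions."""
    suggestions = suggest_org_units(person)
    if org_unit in suggestions:
        return True, []
    return False, suggestions
```

The same lookup serves both purposes named on the slide: assistance (suggested values) and validation of the input.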
The Solution is Required:
• To overcome the ‘effort threshold’ to:
– obtain the required answers from the CRIS
– input and update the information in the CRIS
– maintain data quality in the CRIS
• Across
– local stand-alone CRIS
– heterogeneous distributed CRISs
• Thus achieving ‘nirvana’
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange Standards
• Workflow on the GRIDs surface
• Conclusion
The R&D Process: Recording
Workprogramme
Proposal
Project
Results
Exploitation
Wealth Creation
CRIS DATABASE
The R&D Process: Feedbacks
Workprogramme
Proposal
Project
Results
Exploitation
Wealth Creation
CRIS DATABASE
The R&D Process: Review
Workprogramme
Proposal
Project
Results
Exploitation
Wealth Creation
review, review, review, review
CRIS DATABASE
The WorkProgramme Process
Workprogramme
Economic factors
Societal factors
Technology Foresight
CRIS DATABASE
- World / Country State
- World / Country Models
- Technology Prediction
- Solicited Advice
The Proposal Process
Proposal
Idea
Review Previous Work
Objectives
Method
Resources and dependencies
CRIS DATABASE
- Previous Results
- Previous Projects
CRIS DATABASE
- Human Resources
- Finance
The Project Process
Project
Project Management System
CRIS DATABASE
- Previous Results
- Previous Projects
CRIS DATABASE
- Human Resources
- Finance
The Results Process
Results
Initial Results
Internal Review
Peer Review
Publication or Registration
CRIS DATABASE
CRIS DATABASE
- Previous Results
The Exploitation Process
Exploitation
Results
Business Plan
Finance
Production
Marketing
Selling
CRIS DATABASE
- Marketing Information
- Economic Information
The Wealth Creation Process
Exploitation
Wealth Creation
marketing
production
employment
CRIS DATABASE
- Marketing Information
- Economic Information
The R&D Process: Recording
Workprogramme
Proposal
Project
Results
Exploitation
Wealth Creation
CRIS DATABASE
The R&D Process: Recording WorkProgramme
Workprogramme
– ProgrammeName
– Funding
– OrgUnit
– Person responsible
– Workprogramme document
CRIS DATABASE
The R&D Process: Recording Proposal
Proposal
– Title
– Abstract
– Person(s)
– OrgUnit(s)
– Proposal Document
CRIS DATABASE
The R&D Process: Recording Project
Project
– Title
– Abstract
– Person(s)
– OrgUnit(s)
– Funding
– Project Plan
CRIS DATABASE
The R&D Process: Recording Results-Product
Results
– Person(s)
– OrgUnit(s)
– Project(s)
– Product(s)
– Product Description
CRIS DATABASE
The R&D Process: Recording Results-Patent
Results
– Person(s)
– OrgUnit(s)
– Project(s)
– Patent(s)
– Patent File
CRIS DATABASE
The R&D Process: Recording Results-Publication
Results
– Person(s)
– OrgUnit(s)
– Project(s)
– Bibliographic Information
– Article
CRIS DATABASE
The R&D Process: Recording Exploitation
Exploitation
– Person(s)
– OrgUnit(s)
– Business plan
– Finance Data
– Marketing Data
– Production Data
– Sales Data
CRIS DATABASE
The R&D Process: Recording Wealth Creation
Wealth Creation
– Person(s)
– OrgUnit(s)
– Annual Reports/Accounts
– Employment Records
– Dividends Records
CRIS DATABASE
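The recording slides list, stage by stage, what is stored in the CRIS database. As a rough sketch, two of those records could be modelled as follows. The field names follow the slides, but the Python types, the class names and the `publications_for` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Project:
    # Fields follow the 'Recording Project' slide; the types are assumptions.
    title: str
    abstract: str
    persons: List[str]
    org_units: List[str]
    funding: str
    project_plan: str  # reference to the plan document, not the document itself


@dataclass
class PublicationResult:
    # Fields follow the 'Recording Results-Publication' slide.
    persons: List[str]
    org_units: List[str]
    projects: List[str]  # titles of the projects that produced the result
    bibliographic_information: str
    article: str  # reference to the full text


@dataclass
class CrisDatabase:
    projects: List[Project] = field(default_factory=list)
    publications: List[PublicationResult] = field(default_factory=list)

    def publications_for(self, project_title):
        """Follow the Project(s) link from results back to one project."""
        return [p for p in self.publications if project_title in p.projects]


# Made-up records for illustration only.
db = CrisDatabase()
db.projects.append(
    Project("Demo project", "An abstract.", ["J. Smith"],
            ["Physics Department"], "Grant X", "plan-001")
)
db.publications.append(
    PublicationResult(["J. Smith"], ["Physics Department"], ["Demo project"],
                      "Smith (2005), Demo Journal", "article-001")
)
```

Because each result record carries its Project(s), Person(s) and OrgUnit(s), later stages can be traced back to earlier ones without re-entering data.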
The R&D Process
Workprogramme
Proposal
Project
Results
Exploitation
Wealth Creation
Note: some CRIS developers limit recording of outputs from the process to areas indicated
Nirvana
Complete Process ICT Support
• Nirvana is
– a complete,
– integrated,
– end-to-end ICT support
– for the research process
– across heterogeneous distributed CRISs
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange Standards
• Workflow on the GRIDs surface
• Conclusion
Metadata and Data Exchange Standards
• Metadata
– a succinct representation of the object of interest
– Schema, navigational, associative [descriptive, restrictive, supportive]
– Used for rapid retrieval of navigational data to objects of interest
– Can also be used for statistical purposes (‘how many…’, ‘average number of…’)
[Diagram: data (document) with SCHEMA, NAVIGATIONAL and ASSOCIATIVE metadata; labels: ‘how to get it’, ‘constrain it’, ‘view to users’]
Metadata
• Many kinds and standards exist
• Examples include:
– Publications: MARC, DC (Dublin Core)
– Geospatial: CSDGM (Content Standard for Digital Geospatial Metadata)
– Engineering: STEP
– Education: LOM (Learning Object Metadata); EDNA (Education Network Australia metadata)
Metadata and CRISs
• Commonly a CRIS stores the metadata rather than the object itself
– e.g. result_publicationId which can be used to access the publication itself (person{author}, title, abstract etc usually stored in the CRIS)
– e.g. projectId which can be used to access the detailed project documentation (title, abstract etc usually stored in the CRIS)
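A minimal sketch of that split, assuming the CRIS keeps descriptive fields plus a location, while the publication itself lives in a separate store. The table layout, the `location` field and the made-up path are hypothetical.

```python
# Hypothetical CRIS table: metadata only, keyed by result_publicationId.
CRIS_PUBLICATIONS = {
    "pub-001": {
        "author": "J. Smith",
        "title": "A demo paper",
        "abstract": "Short summary held in the CRIS.",
        "location": "repository/pub-001.pdf",  # navigational metadata (made-up path)
    },
}

# Stand-in for the external store that holds the publication itself.
OBJECT_STORE = {"repository/pub-001.pdf": b"%PDF demo bytes"}


def describe(publication_id):
    """Answer from metadata alone, without touching the object."""
    record = CRIS_PUBLICATIONS[publication_id]
    return f"{record['author']}: {record['title']}"


def fetch_object(publication_id):
    """Use the stored location to reach the publication itself."""
    location = CRIS_PUBLICATIONS[publication_id]["location"]
    return OBJECT_STORE[location]
```

Most questions are answered by `describe`-style lookups on the metadata; the object is fetched only when the user actually wants the document.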
Metadata: DCf: Publications
UniqueId
Person
OrgUnit
Security
Privacy
AccessLevel
Charge
Restrictive
Annotation
Classification
Quality Assessment
OrgUnit
UniqueId
Domain of CERIF
Person
Project
ResourceIdentifier
Subject
Keywords
Description
Resource Type
Coverage Temporal
Coverage Spatial
Title
Descriptive
Navigational
Metadata in CRISs
• Used for
– Quality: validation on input / update
– Summarising: overview results
– Retrieval speed (find the list of objects of potential interest)
– Controlling access
– Rights management
– And…
Metadata in Interoperating CRISs
• Metadata essential to allow interoperation of CRISs, especially heterogeneous distributed CRISs
• Provides the information necessary to automatically set up retrieval (or update) over heterogeneous CRISs
– Catalog technique
– Universal schema technique(s)
– Knowledge-based reconciliation technique(s)
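As a much simplified illustration of how metadata can drive retrieval over heterogeneous CRISs, the sketch below uses a catalog that maps each CRIS's local field names onto one shared schema. The CRIS names, field names and records are all invented, and real catalog or universal-schema techniques are considerably richer than a rename table.

```python
# Hypothetical catalog: for each CRIS, how its local field names map to a shared schema.
CATALOG = {
    "cris_a": {"titel": "title", "autor": "person"},
    "cris_b": {"name": "title", "researcher": "person"},
}

# Two made-up CRISs with different local schemas.
SOURCES = {
    "cris_a": [{"titel": "Grid workflow", "autor": "J. Smith"}],
    "cris_b": [{"name": "Metadata exchange", "researcher": "M. Rossi"}],
}


def to_common(cris_name, record):
    """Rename local fields to the shared schema using the catalog entry."""
    mapping = CATALOG[cris_name]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def federated_search(person):
    """Run one query, phrased in the shared schema, over every catalogued CRIS."""
    hits = []
    for cris_name, records in SOURCES.items():
        for record in records:
            common = to_common(cris_name, record)
            if common.get("person") == person:
                hits.append((cris_name, common["title"]))
    return hits
```

The query is written once against the shared schema; the catalog metadata does the work of reaching each local CRIS.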
Metadata and Data Exchange Standards
• Data Exchange Standards
– Needed not just for data (file) exchange
– Also for returning results of a retrieval from one CRIS to another in a form (syntax, semantics) that is processable
• Metadata plus dataset
– Note data exchange standards used extensively in e-business, banking, insurance, medical, engineering, research areas
The Key: Metadata and Data Exchange Standards
• Nirvana is
– Formal metadata (machine understandable)
– Query: Metadata describing CRIS resources to improve queries
– Answer: Metadata attached to Query result files (data exchange) so the receiving CRIS or user can understand the output
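The 'Answer' point can be sketched as a result envelope that carries metadata alongside the dataset. JSON is used here only for convenience; the envelope layout, field names and functions are assumptions, not a data exchange standard named in the talk.

```python
import json


def package_result(query, rows, field_descriptions):
    """Bundle a result dataset with metadata that explains its columns."""
    return json.dumps({
        "metadata": {
            "query": query,
            "fields": field_descriptions,
            "row_count": len(rows),
        },
        "dataset": rows,
    })


def read_result(payload):
    """Receiving side: use the attached metadata to label each value."""
    envelope = json.loads(payload)
    fields = envelope["metadata"]["fields"]
    return [
        {fields[name]["label"]: value for name, value in row.items()}
        for row in envelope["dataset"]
    ]


# Made-up query result with terse local column names.
payload = package_result(
    query="projects by person",
    rows=[{"p": "J. Smith", "t": "Grid workflow"}],
    field_descriptions={
        "p": {"label": "person", "type": "string"},
        "t": {"label": "project title", "type": "string"},
    },
)
```

Without the attached field descriptions, the receiving CRIS would see only the columns `p` and `t` and could not interpret them.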
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange Standards
• Workflow on the GRIDs surface
• Conclusion
Workflow on the GRIDs surface
• GRIDs ‘surface’ provides
– Computational capabilities of GRID
– Information presentation capabilities of WWW
– Information management capabilities
• But not yet an environment for workflow
The GRIDs Architecture
Knowledge Layer
Information Layer
Computation / Data Layer
Data to Knowledge
Control
The GRIDs Architecture
Data to Knowledge
Control
Particle Physics Application
Genomics Application
Environmental Application
E-Business Application
A POSSIBLE ARCHITECTURE
U: USER
S: SOURCE
R: RESOURCE
Um: User Metadata
Ua: User Agent
Sm: Source Metadata
Sa: Source Agent
Rm: Resource Metadata
Ra: Resource Agent
brokers
The GRIDs Environment
A Brief History of GRIDs
• 1G: custom-made architecture machines to user
– Pioneering metacomputing
• 2G: proprietary standards and interfaces
– I-WAY GLOBUS, UNICORE, CONDOR, LEGION AVAKI
• 2.5G: added in FTP, SRB, LDAP, AccessGRID
• 3G: adopted W3C concepts for open interfaces
– OGSA / OGSI: note especially OGSA/DAI
– But built on 2.G foundations
e-Science Apps
e-Science R&D
But…..
• This comes nowhere near the requirements as originally defined for GRIDs
• Too low-level (programmer not end-user level)
– Insufficient representativity
– Insufficient expressivity
– Insufficient resilience
– Insufficient dynamic flexibility
So…..
• The US GRID is metacomputing plus extensions
– In 2002 improved with OGSA using W3C Web Services ideas
• European position is that GRID architecture (GLOBUS or even UNICORE) is the wrong starting point for the European vision
And…..
• EC persuaded of importance of GRIDs
– Started in IST/Environment (early 2000) with IT architectural framework for FP6 projects
– Set up GRID Unit under Wolfgang Boch (late 2002)
• January 2003: large workshop (GRID Unit)
– (~ 240 participants)
– Keynotes:
• Thierry Priol (INRIA, FR)
• Domenico Laforenza (CNR, IT)
• Keith Jeffery (CCLRC, UK)
NGG Requirements
• Transparent and reliable
• Open to wide user and provider communities
• Pervasive and ubiquitous
• Secure and provide trust across multiple administrative domains
• Easy to use and to program
• Persistent
• Based on standards for software and protocols
• Person-centric
• Scalable
• Easy to configure and manage
2.5G or even 3G GRID basically meet none of these
WWW meets some of these
NGG
• NGG1: 200301-200306
– Brought together visionary experts
– Defined properties required and research agenda to achieve them
• NGG2: 200401-200407
– Updated NGG1 vision in the light of funded projects and evolving requirements and technology
• NGG3: 200509-
• http://www.cordis.lu/ist/grids/pub-report.htm
GRIDs Vision and Requirements (1)
• a user interacts with the GRIDs environment intelligently
• such that the GRIDs environment proposes a 'deal' to the end-user to satisfy her request
• which the user can then decide to execute - involving multiple resources of computation, information, detectors (for new data collection), interactions with other users through various communication devices etc.
GRIDs Vision and Requirements (2)
• interoperation as a seemingly homogeneous 'surface' over a range of devices from smart dust through detectors to embedded systems (including controllers), handhelds, laptops, desktops, departmental servers, corporate servers and supercomputers.
• the 'surface' depends on self-* (self-managing, self-repairing, self-tuning...) capability across arbitrary and dynamic collections of (large numbers of) nodes to give scalability, performance, reliability, access, security, privacy and other features.
NGG1
• NGG1 Properties Required:
– Transparent and reliable
– Open to wide user and provider communities
– Pervasive and ubiquitous
– Secure and provide trust across multiple administrative domains
– Easy to use and to program
– Persistent
– Based on standards for software and protocols
– Person-centric
– Scalable
– Easy to configure and manage
Call2 (NGG1) Projects Funded
– inteliGRID: Semantic Grid based virtual organisations
– Provenance: Provenance for Grids
– DataminingGrid: Datamining tools & services
– UniGridS: Extended OGSA Implementation based on UNICORE
– K-WF Grid: Knowledge based workflow & collaboration
– GRIDCOORD: Building the ERA in Grid research
– COREGRID: European-wide virtual laboratory for longer term Grid research, creating the foundation for the next generation Grids
– NEXTGRID: EU-driven Grid services architecture for business and industry
– AKOGRIMO: Mobile Grid architecture and services for dynamic virtual Organisations
– SIMDAT: Grid-based generic enabling application technologies to facilitate solution of industrial problems
– OntoGrid: Knowledge Services for the semantic Grid
– HPC4U: Fault tolerance, dependability for Grid
Figure 1: The Call 2 Projects as a ‘house’
NGG2 SWOT(1)
• Ontologies and semantic web technologies will be crucial to provide scalable support for complex, heterogeneous Grids middleware and applications.
• The strengths of the European telecommunications industry and the diversity of its market for electronic control systems have given Europe a leading position in the areas of mobile and embedded technology. This is of particular relevance for the realization of the vision of a Grid as a pervasive, user-centered utility.
• The weakness in hardware and primary software products (e.g. commodity processors, server and desktop Operating systems, Programming Languages, etc.) may hamper the development of a European leadership in Grids Technologies.
NGG2 SWOT(2)
• The convergence between Grids and Web Services provides a significant opportunity to move to a model of software development and service provision where the market dominance of particular OS vendors is no longer a major economic issue.
• The distinctive European vision of a Grids environment that operates from the level of devices to supercomputers, to serve communities ranging from individuals to whole industries, including data, information and knowledge and emphasizing resilience and scalability could have a significant economic and social impact far beyond the scope of existing compute and data Grids. This should be contrasted with the North American Grid vision of programmer-level metacomputing.
• It is vital that any European vision for the evolution of Grids is accompanied by a clear representation of that vision to the key standards bodies and technology providers worldwide.
NGG2 Recommendations
• (a) development of a design for a new operating system that provides a fault-tolerant, scalable, self-healing, self-managing environment upon which Grids service middleware may ‘sit’;
• (b) development of Grids foundations middleware suitable both for enhancing existing operating systems and for inclusion within (a);
• (c) development of Grids service middleware in a modular fashion allowing applications to utilise those services they require;
• (d) research and development in computer science and information technology required to accomplish (c), (b) and (a), notably new models and software for transactions and messaging; for scheduling, resource management and optimisation; for trust, security and privacy; for data, information and knowledge management; for software development and deployment including mobile code; and for intelligent and appropriate user interfaces and device interfaces;
• (e) development of novel applications that are wealth-creating or improve the quality of life, particularly in the e-business domain, but also in e-health, e-environment, e-culture, e-science, e-government;
NGG2
Application A
Application B
Application C
Grids Middleware Services Needed for A
Grids Middleware Services Needed for B
Grids Middleware Services Needed for C
Grids Foundations for Operating System X
Grids Foundations for Operating System Y
Operating System X
Operating System Y
Grids Operating System (including Foundations)
Modular and dynamically loadable
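The layering above, in which each application sits on only the middleware services it needs, can be sketched with a small registry that loads services on demand. The service names, the `Middleware` class and its `require` method are hypothetical and stand in for whatever real Grids service middleware would provide.

```python
# Hypothetical service registry: service name -> factory that creates it on demand.
SERVICE_FACTORIES = {
    "scheduling": lambda: "scheduling-service",
    "security": lambda: "security-service",
    "data-management": lambda: "data-management-service",
}


class Middleware:
    """Loads only the services an application asks for, and loads each once."""

    def __init__(self, factories):
        self._factories = factories
        self.loaded = {}

    def require(self, names):
        for name in names:
            if name not in self._factories:
                raise KeyError(f"unknown service: {name}")
            if name not in self.loaded:
                # Dynamic loading: the service is created on first request only.
                self.loaded[name] = self._factories[name]()
        return [self.loaded[name] for name in names]


middleware = Middleware(SERVICE_FACTORIES)
app_a = middleware.require(["scheduling", "security"])  # services needed for A
app_b = middleware.require(["security"])                # services needed for B
```

Here the data-management service is never loaded because no application asked for it, which is the point of a modular, dynamically loadable design.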
Workflow on the GRIDs Surface
• Nirvana is
– GRIDs ‘surface’
• Providing computation, information presentation and information management
– Plus Self* resilience
– Plus capabilities to support workflow
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange Standards
• Workflow on the GRIDs surface
• Conclusion
Overall : The Way Forward
SCIENTIFIC DATASETS
Data
Information
Knowledge
PUBLICATIONS
Data
Information
Knowledge
CRIS
Management of Research
PUBLICATIONS
Data
Information
Knowledge
Overall : The Way Forward
Digital Curation Facility
SCIENTIFIC DATASETS
Data
Information
Knowledge
CRIS
Management of Research
CDR (CERIF)
Portal with knowledge-assisted user interface
Overall : The Way Forward
Digital Curation Facility
SCIENTIFIC DATASETS
Data
Information
Knowledge
PUBLICATIONS
Data
Information
Knowledge
metadata
Portal with knowledge-assisted user interface
Overall : The Way Forward
Digital Curation Facility
SCIENTIFIC DATASETS
Data
Information
Knowledge
PUBLICATIONS
Data
Information
Knowledge
metadata
publish
validate
Portal with knowledge-assisted user interface
Overall : The Way Forward
Digital Curation Facility
SCIENTIFIC DATASETS
Data
Information
Knowledge
PUBLICATIONS
Data
Information
Knowledge
metadata
publish
validate
GRIDs
Portal with knowledge-assisted user interface
Ambient, Pervasive Access
Overall : The Way Forward
Portal with knowledge-assisted user interface
Digital Curation Facility
SCIENTIFIC DATASETS
Data
Information
Knowledge
PUBLICATIONS
Data
Information
Knowledge
metadata
publish
validate
GRIDs
Ambient, Pervasive Access
Three Steps to Nirvana
Complete Process ICT Support
Metadata and Data Exchange Standards
Workflow on the GRIDs Surface
The Perfect CRIS
Prof. Keith G Jeffery
Director, Information Technology
Head, Business & Information Technology Department
CCLRC Rutherford Appleton Laboratory
http://www.bitd.clrc.ac.uk/