ECMWF 2004, CCLRC e-Science Centre
An Integrated Computing and Data Environment for Environmental Science
Kerstin Kleese van Dam, Lisa Blanshard, Rik Tyer
Science Drivers
• Radioactive waste disposal
• Crystal growth and scale inhibition
• Pollution: molecules and atoms on mineral surfaces
• Crystal dissolution and weathering
eMinerals Partners
• Royal Institution
• University of Reading
• CCLRC Daresbury
eMinerals Team
11 Principal Investigators
12 PDRAs
Many other direct and indirect Collaborators
Resources
16 Node Linux Cluster
40 Node Linux Cluster
25 Node Condor Cluster
16 Node Linux Cluster
910 Node Condor Pool
University of Reading
24 Node IBM Cluster
16 Node Linux Cluster
CCLRC Daresbury
HPCx
4 Node IBM Database System
+ National Grid Service at Manchester, Leeds, Oxford and CCLRC
Challenges
10 Different sites and administrations – user names, passwords, batch systems
13 Different Computers with varying operating systems, compilers, file systems, licenses
Question:
How can scientists be enabled to use these resources to their full extent, without spending their days locked in administration?
Solution
Single Sign On – to all resources – computing, data and application -> X.509 certificates for authentication + separate authorisation certificates
One Job Submission Interface – to all compute facilities -> Condor + Globus V2
One File System – on all facilities – computing and data -> Storage Resource Broker (SDSC + CCLRC)
Metadata Capture for all activities – CML + CCLRC Scientific Metadata Model -> Metadata Editor
One Stop Data Access – to all data -> CCLRC DataPortal Software
Compute Grids
Beowulf Clusters
Globus Toolkit 2
SMP Machines
Condor Pools
• Sharing of resources using Globus Toolkit 2
  • Common security infrastructure
  • Common access mechanisms
  • Degree of abstraction from underlying system
• Aggregation of resources using Condor
  • Can build significant resources for HTPC out of existing infrastructure
Data Management
• Distributed file system using SRB
  • Files can be organised logically regardless of physical location and storage media
  • Facilitates sharing of data files within VO and to collaborators
  • Data files / executables are immediately available to compute resources
eMinerals Minigrid
Interface
Scientists are able to:
• Put their input files into their SRB directory
• Choose a suitable application executable in SRB
• Use Condor DAGMan to define the workflow/dependencies for a calculation, allowing for parameter sweeps, ensemble runs and linked execution
• Choose a suitable resource type
• Submit the DAGMan script using their e-Science certificate
• Review results in SRB
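As a rough illustration of the DAGMan step, the sketch below generates the text of a DAG input file for a parameter sweep followed by one dependent job. The `JOB`, `VARS` and `PARENT ... CHILD` keywords are standard DAGMan syntax; the function name, the submit file names and the `temperature` parameter are assumptions made for this example and do not come from the slides.

```python
from typing import Iterable


def build_sweep_dag(submit_file: str, temperatures: Iterable[float],
                    collect_submit_file: str) -> str:
    """Return DAGMan input text: one simulation node per temperature,
    then a single node that runs after all simulations have finished."""
    lines = []
    node_names = []
    for index, temperature in enumerate(temperatures):
        name = f"sim{index}"
        node_names.append(name)
        # JOB binds a node name to a Condor submit description file.
        lines.append(f"JOB {name} {submit_file}")
        # VARS defines a per-node macro that the submit file can
        # reference as $(temperature).
        lines.append(f'VARS {name} temperature="{temperature}"')
    if not node_names:
        raise ValueError("a sweep needs at least one parameter value")
    lines.append(f"JOB collect {collect_submit_file}")
    # PARENT ... CHILD expresses the dependency (linked execution).
    lines.append(f"PARENT {' '.join(node_names)} CHILD collect")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; the output would be saved as a .dag file.
    print(build_sweep_dag("simulate.sub", [300.0, 600.0, 900.0], "collect.sub"))
```

The resulting file would typically be handed to `condor_submit_dag`; how that submission is combined with the e-Science certificate on the minigrid is not shown here.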
The CCLRC DataPortal
DataPortal – a one-stop shop for searching and accessing data from different organisations on heterogeneous systems in a uniform way. It allows parallel querying of various resources and offers a personal, permanent workspace for working with the data. The system is based on a web services architecture, connects well with other services and offers a high level of security.
http://www.e-science.clrc.ac.uk/web/projects/dataportal
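Parallel querying of several resources can be sketched as follows. This is not the DataPortal implementation or its web service interface: the catalogues here are in-memory lists standing in for remote services, and the function names and facility names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_catalogue(name, records, keyword):
    """Search one catalogue; stands in for a call to a remote service."""
    keyword = keyword.lower()
    return [(name, title) for title in records if keyword in title.lower()]


def search_all(catalogues, keyword):
    """Query every catalogue concurrently and merge the hits."""
    workers = max(1, len(catalogues))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(search_catalogue, name, records, keyword)
                   for name, records in catalogues.items()]
        hits = []
        for future in futures:
            # result() blocks until that catalogue has answered.
            hits.extend(future.result())
    # Sort so the merged result does not depend on response order.
    return sorted(hits)


if __name__ == "__main__":
    catalogues = {
        "facility_a": ["Calcite surface simulation", "Quartz dissolution run"],
        "facility_b": ["Calcite growth study"],
    }
    for source, title in search_all(catalogues, "calcite"):
        print(source, title)
```

The user issues one query and receives one merged answer, while each underlying catalogue is searched independently.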
Full Circle
[Diagram. Stage labels: Discovery, Annotation, Result Storage, Publish Results, Discovery, Analysis, Results. Component labels: CCLRC DataPortal, CCLRC Metadata Format, SDSC SRB, Condor, Minigrid Compute Resources, CCLRC Metadata Editor, SDSC SRB, Metadata Database.]
Future
• Automation of Metadata Capturing Processes
• Linkage to e-Publication
• Better Search Interfaces
• Virtual Dataset Generation + Annotation Facilities
• Assimilation and Mining of Data from variable Sources
Summary
• Have production minigrid infrastructure comprising data, metadata, HPC and HTPC resources
• Minigrid infrastructure has enabled real science research
• Working on further integration of different areas of functionality within minigrid
Thank you for your attention.
Any questions?
Contact details
http://www.e-science.clrc.ac.uk