1
Open Science Grid: Project Statement & Vision
Transform compute- and data-intensive science through a cross-domain, self-managed, national distributed cyberinfrastructure that brings together campus and community infrastructure, facilitating the research of Virtual Organizations at all scales.
[Chart: VO Jobs Running on OSG in 2006]
2
Why the effort is important:
Sustained growth in the needs of traditional compute- and data-intensive science;
A steady stream of scientific domains adding and expanding the role of computing and data processing in their discovery process;
Coupled with the administrative and physical distribution of compute and storage resources, and the growing size, diversity and scope of scientific collaborations.
Facility (preliminary commitments):
[Chart: projected CPU capacity by VO (ATLAS, CMS, LIGO, STAR, others), 2006–2009, in CPU Million SpecInt2000s]
[Chart: projected cache disk by VO (ATLAS, CMS, LIGO, STAR, others), 2006–2009, in Petabytes]
3
Goals of the OSG:
Support data storage, distribution & computation for High Energy, Nuclear & Astro Physics collaborations, in particular delivering to the schedule, capacity and capability needed for LHC and LIGO science.
Engage and benefit other Research & Science of all scales through progressively supporting their applications.
Educate & train students, administrators & educators.
Operate & evolve a petascale Distributed Facility across the US providing guaranteed & opportunistic access to shared compute & storage resources.
Interface & Federate with Campus, Regional, other national & international Grids (including EGEE & TeraGrid).
Provide an Integrated, Robust Software Stack for Facility & Applications, tested on a well provisioned at-scale validation facility.
Evolve the capabilities offered by the Facility by deploying externally developed new services & technologies.
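To illustrate the guaranteed and opportunistic access the Facility offers, a VO user of that era would typically submit work through Condor-G's grid universe against a site's Globus GRAM gatekeeper. The sketch below assumes a hypothetical gatekeeper host and executable; the general submit-file shape matches what the VDT software stack distributed:

```
# Hypothetical Condor-G submit file: run one job on an OSG site
# via a Globus GRAM (gt2) gatekeeper. Host and script names are
# made up for illustration.
universe      = grid
grid_resource = gt2 gatekeeper.example-site.edu/jobmanager-condor
executable    = analyze_events.sh
arguments     = run2006.dat
output        = job.out
error         = job.err
log           = job.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue
```

Submitted with `condor_submit`; Condor-G and the Globus tools it depends on were bundled in the VDT releases shown on the timeline.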
4
Challenges: Sociological and Technical
Develop the organizational and management structure of an open consortium that drives such a CI.
Develop the organizational and management structure for the project that builds, operates and evolves such a CI.
Maintain and evolve a software stack capable of offering powerful and dependable capabilities to the NSF and DOE scientific communities.
Operate and evolve a dependable facility.
Consortium members:
Boston University
Brookhaven National Laboratory
California Institute of Technology
Columbia University
Cornell University
Fermi National Accelerator Laboratory
Indiana University
Lawrence Berkeley National Laboratory
Renaissance Computing Institute
Stanford Linear Accelerator Center
University of California, San Diego
University of Chicago/Argonne National Laboratory
University of Florida
University of Iowa
University of Wisconsin, Madison
http://www.opensciencegrid.org
5
Software Stack
Software Release Process
Grid of Grids: From Local to Global
Computational Science: Here, There and Everywhere
Global Research & Shared Resources
6
Integrated Network Management
Timeline & Milestones (preliminary)
LHC Simulations Support 1000 Users; 20PB Data Archive
Contribute to Worldwide LHC Computing Grid LHC Event Data Distribution and Analysis
Contribute to LIGO Workflow and Data Analysis
Additional Science Communities: +1 community each year
Facility Security : Risk Assessment, Audits, Incident Response, Management, Operations, Technical Controls
Plan V1; 1st Risk Assessment and Audit, then annual Risk Assessments and Audits
VDT and OSG Software Releases: Major Release every 6 months; Minor Updates as needed (VDT 1.4.0, VDT 1.4.1, VDT 1.4.2, …)
Advanced LIGO LIGO Data Grid dependent on OSG
CDF Simulation
STAR, CDF, D0, Astrophysics
D0 Reprocessing
STAR Data Distribution and Jobs: 10K jobs per day
D0 Simulations
CDF Simulation and Analysis
LIGO data run SC5
Facility Operations and Metrics: Increase robustness and scale; Operational Metrics defined and validated each year.
Interoperate and Federate with Campus and Regional Grids
Timeline span: 2006–2011 (Project start; End of Phase I; End of Phase II)
VDT Incremental Updates
dCache with role-based authorization
OSG 0.6.0 OSG 0.8.0 OSG 1.0 OSG 2.0 OSG 3.0 …
Accounting Auditing
VDS with SRM
Common S/W Distribution with TeraGrid
EGEE using VDT 1.4.X
Transparent data and job movement with TeraGrid
Transparent data management with EGEE
Federated monitoring and information services
Data Analysis (batch and interactive)
Workflow
Extended Capabilities & Increased Scalability and Performance for Jobs and Data to meet Stakeholder needs:
SRM/dCache Extensions
“Just in Time” Workload Management
VO Services Infrastructure
Improved Workflow and Resource Selection
Work with SciDAC-2 CEDS and Security with Open Science
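The SRM/dCache storage access named in the milestones above can be illustrated with the srmcp client distributed in the VDT. The sketch below moves one file from a site's SRM endpoint to local disk; the endpoint host and file paths are hypothetical:

```
# Copy a file out of a (hypothetical) dCache SRM endpoint to local disk.
# srmcp is the SRM client shipped with dCache/VDT; host and paths are
# made up for illustration.
srmcp \
  srm://se.example-site.edu:8443/pnfs/example-site.edu/data/star/run2006/events.root \
  file:////scratch/events.root
```

The same srm:// URL form is what the "Transparent data management with EGEE" and TeraGrid data-movement milestones interoperate over.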