M. Biasotto – INFN T1 + T2 cloud workshop, Bologna, November 21 2006
T2 storage issues
M. Biasotto – INFN Legnaro
T2 issues
Storage management is the main issue for a T2 site
CPU and network management are easier
• years of experience
• stable tools (batch systems, installation, ...)
• total number of machines for average T2 is small: ~XX
Several different issues in storage
• hardware: which kind of architecture and technology?
• hw configuration and optimization
• network between storage and CPUs
• storage resource managers
Hardware
Which kind of hardware for T2 storage?
• SAN based on SATA/FC disk-arrays and controllers
● flexibility and reliability
• DAS (Direct Attached Storage) Servers
● cheap and good performance
• Others?
● iSCSI, AoE (ATA over Ethernet), ....
There are already working groups dedicated to this (technology tracking, tests, etc.), but information is a bit dispersed
Important, but not really critical?
• once you have bought some disks, you are stuck with them for years, but mixing different types usually is not a problem.
Current status of Italian T2s
Site       Hardware     Storage Manager   TeraBytes
Bari       DAS          dCache            10
Catania    SATA/FC      DPM               19
Frascati   SATA/FC      DPM                6
Legnaro    SATA/FC      DPM               17
Milano     SATA/FC      DPM                3
Napoli     SATA/SCSI    DPM                5
Pisa       SATA/FC      dCache
Roma                    DPM
Torino     SATA/FC
Storage configuration
Optimal storage configuration is not easy: there are a lot of factors to take into consideration
• how many TB per server?
• which RAID configuration?
• fine tuning of parameters: in disk-arrays, controllers and servers (cache, block sizes, buffer sizes, kernel params, ... a long list)
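As a purely illustrative sketch of the Linux server-side knobs involved (the device name and all values below are arbitrary examples, not recommendations; disk-array and controller settings are vendor-specific):

```shell
# Illustrative server-side tuning only; values are examples, not advice.

# Raise the read-ahead on a data volume (device name is an example):
blockdev --setra 8192 /dev/sdb

# Enlarge kernel TCP buffers for high-throughput WAN/LAN transfers:
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
sysctl -w net.ipv4.tcp_rmem="4096 87380 8388608"
sysctl -w net.ipv4.tcp_wmem="4096 65536 8388608"
```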
Disk-pools architecture: is one large pool enough, or do we need to split?
• buffer pools (WAN transfer buffer, local WN buffer)?
• different pools for different activities (production pool, analysis pool)?
Network configuration: avoid bottlenecks between servers and CPU
Optimal configuration depends strongly on the application
• 2 main (very different) types of access: remote I/O from WN or local copy to/from WN. Currently remote I/O for CMS and local for Atlas.
• production and analysis activities have different access patterns
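The two access modes above can be sketched with the copy and I/O tools of the period (the hostnames and paths below are invented examples):

```shell
# Mode 1: local copy - stage the whole file to the WN scratch disk first,
# then read it locally (dccp is the dCache copy client, rfcp the rfio one):
dccp dcap://se.example.infn.it/pnfs/example.infn.it/data/cms/file.root $TMPDIR/
rfcp /dpm/example.infn.it/home/atlas/file.root $TMPDIR/file.root

# Mode 2: remote I/O - the application opens the file in place on the
# storage element, e.g. from ROOT through the dcap or rfio plugins:
#   TFile::Open("dcap://se.example.infn.it/pnfs/example.infn.it/data/cms/file.root")
#   TFile::Open("rfio:/dpm/example.infn.it/home/atlas/file.root")
```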
Storage configuration
Optimal configuration varies depending on many factors: there is no single simple solution; every site will have to fine-tune its own storage
But having some guidelines would be useful
• leverage on current experience (mostly at T1)
Configuration can have huge effects on performance, but it's not so critical
• many of these can be easily changed and adjusted
Storage Resource Manager
Which Storage Resource Manager for a T2?
• DPM, dCache, Storm
• Xrootd protocol required by Alice (where should this go?)
The choice of a SRM is a more critical issue: it’s much more difficult to change
• adopting one and learning how to use it is a large investment: know-how in deployment, configuration, optimization, problem finding and solving, ...
• obvious practical problems if a site has a lot of data already stored
First half of 2007 last chance for a final decision?
• of course nothing is ever ‘final’, but after that a transition would be much more problematic
Requirements
Performance & scalability
• how much is needed for a T2?
WAN bandwidth ~ 100 MB/s
LAN bandwidth > 300 MB/s ??
Disk ~ 500 TB
Concurrent access > 300 ??
Reliability & stability
Advanced features
• data replication, internal monitoring, xxx, xxx
Cost? (in terms of human and hardware resources)
dCache
dCache is currently the most mature product
• in production use for several years
• deployed at several large sites: T1 FNAL, T1 FZK, T1 IN2P3, all US-CMS T2s, the DESY T2, ...
There is no doubt it will satisfy the performance and scalability needs of a T2
Two key features to guarantee performance and scalability:
Services can be split among different nodes
• all ‘access doors’ (gridftp, srm, dcap) can be replicated
• also ‘central services’ (which usually run all on the admin node) can be distributed
“Access queues” to manage high number of concurrent accesses
• storage access requests are queued and can be distributed, prioritized, limited based on protocol type or access type (read/write)
• buffers temporary high load, avoiding server overloading
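As a hypothetical illustration of how such queue limits are set (the host, port, pool name and values are invented examples; the exact admin commands depend on the dCache version):

```shell
# Connect to the dCache admin interface (host and port are examples):
ssh -p 22223 admin@admin.example.infn.it

# Inside the admin shell, attach to a pool cell and cap the number of
# concurrent movers, so further requests wait in the access queue
# instead of overloading the server:
#   cd pool1_example
#   mover set max active 100
#   save
```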
dCache
A lot of advanced features
• data replication (for 'hot' datasets)
• dynamic and highly configurable pool match-making
• pool draining for scheduled maintenance operations
• grouping and partitioning of pools
• internal monitoring and statistics tool
dCache issues
Services are heavy and not very efficient
• written in java, require a lot of RAM and CPU
• central services can be split, but the question is: do they need to be, even at a T2 site? Having to manage several dCache admin nodes could be a problem
More costly in terms of the human resources needed
• more difficult to install, not integrated in LCG distribution
• steeper learning curve, documentation needs to be improved
It’s more complex, with more advanced features, and this obviously comes at a cost
• does a T2 need the added complexity and features, and can they be afforded?
Still missing VOMS support and SRM v2, but both should be available soon (where is it best to put this?)
INFN dCache experience
Used in production at Bari since May 2005, building up a lot of experience and know-how
Overall: good stability and performance
• [Bari plots]
INFN dCache experience
Performance test at CNAF in ??? 2005 (or was it 2006?)
• ???? demonstrated
• [plots]
INFN dCache experience
Pisa experience: from DPM to dCache (or maybe this should go at the end of the DPM section, where the CMS problems that caused the migration are discussed)
Storm
Developed in collaboration between INFN-CNAF and ICTP-EGRID (Trieste)
Designed for disk-based storage: implements a SRM v2 interface on top of an underlying parallel or cluster file-system (GPFS, Lustre, etc.)
Storm takes advantage of the aggregation functionalities of the underlying file-system to provide performance, scalability, load balancing, fault tolerance, ...
• not bound to a specific file-system: in principle this allows one to exploit the very active research and development in the cluster file-systems field
Support for SRM v2 functionalities (space reservation, lifetime, file pinning, pre-allocation, ...) and ACLs
Full VOMS support
So far Storm has been penalized by the fact that it supported only SRM v2, while LCG is still running with SRM v1
• no site could deploy it in production
Storm
Scalability
• Storm servers can be replicated
• centralized database: currently MySQL, others (e.g. Oracle) possible in future releases
Advanced features provided by the underlying file-system
• GPFS: data replication, pool vacation
Storm issues
Not used in production anywhere so far, and only a few test installations at external sites
It’s likely that a first “field test” would result in a lot of small issues and problems (shouldn’t be a concern in the longer term)
Installation and configuration not easy
• but mostly due to too few deployment tests
• recent integration with yaim should bring improvements in this area
No access queues for concurrent access management (to avoid server overloading)
No internal monitoring
There could be compatibility issues between the underlying cluster file-system and some VO applications
• some file-systems have specific requirements on kernel version
INFN Storm experience
Obviously CNAF has all the needed know-how on Storm
Also GPFS experience within INFN, mostly at CNAF but not only there (Catania, Trieste, Genova, ...)
• overall good in terms of performance, scalability and reliability
Performance test at CNAF in xxx 2005 (?) on a Storm + GPFS testbed
• [plots and results] (see A. Forti's slides from Otranto)
Storm installations for deployment and functionality tests
• Padova (?)
• Legnaro (GridCC)
• others?
DPM
DPM is the SRM system supported by LCG, distributed with LCG middleware
Yaim support: easy installation
Possible migration from old classic SE
It’s the natural choice for an LCG site that needs SRM and doesn’t have too many special concerns
a lot of DPM installations around
....
VOMS support
SRM v2 implementation (but still limited functionalities)
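For context, a YAIM-based DPM installation of that era looked roughly like this (the script paths and the node type name are approximate examples and depend on the middleware release):

```shell
# Approximate gLite 3.0-era YAIM invocation for a DPM head node;
# check the release notes for the exact node type name:
/opt/glite/yaim/scripts/install_node site-info.def glite-SE_dpm_mysql
/opt/glite/yaim/scripts/configure_node site-info.def glite-SE_dpm_mysql
```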
DPM issues
Still lacking many functionalities (some of them important)
• load balancing very simple (round robin among file-systems in pool) and not configurable
• data replication still buggy in current release
• pool draining for server maintenance or dismission
• pool selection based on path
• internal monitoring
• support for multi-groups pools
Scalability limits?
• no problem for rfio and gridftp services: easily distributed on pool servers
• but what about the ‘central services’ on the head node? In principle the ‘dpm’, ‘dpns’ and mysql services can be split, but this has not been tested yet (will it be necessary? will it be enough?)
• no ‘access queue’ like in dCache to manage concurrent accesses
DPM/Castor/rfio compatibility issue (where should this go?)
INFN DPM experience
Used in production at many INFN sites
• no major issues or complaints, good overall stability
• but never really stressed
mention DPM+GPFS at Bologna?
DPM @Legnaro
• stability and reliability: CMS LoadTest
• performance: MC production
• but even in CSA06 the system was not stressed enough: so far no evidence of problems or limitations, but the performance values reached are still low
Pisa experience (here or in the dCache section?)
Summary
dCache
• mature product, meets all performance and scalability requirements
• more costly in terms of hardware and human resources
DPM
• important features still missing, but this is not a concern in the longer term (no reason why they shouldn’t be added)
• required performance and scalability not proven yet: are there some intrinsic limits?
Storm
• potentially interesting, but must be tried in production
• required performance and scalability not proven yet: are there some intrinsic limits?
Conclusions
Acknowledgments
Miscellaneous items to add somewhere
In CMS SC4 and CSA06 the vast majority (almost all) of problems and job failures were related to storage issues
• bugs, hw failures, interoperability problems, misconfigurations, ...