TRANSCRIPT
CASTOR evolution
Presentation to HEPiX 2003, Vancouver, 20/10/2003
Jean-Damien Durand, CERN-IT
Outline
• CASTOR status
• Current developments
• Conclusions
CASTOR status
• Usage at CERN
– ~1.6 PB of data, ~12 million files
(Chart: the drop in file count is due to the deletion of 0.75 million ALICE MDC4 files of 1 GB each.)
Plan for current developments
• The vision
• Some problems with today’s system
• Proposal
– Ideas
– Architecture
– Request registration and scheduling
– Catalogues
– Disk server access and physical file ownership
– Some interesting features
• Project planning and progress monitoring
Vision...
• With clusters of hundreds of disk and tape servers, automated storage management increasingly faces the same problems as CPU cluster management:
– (storage) resource management
– (storage) resource sharing
– (storage) request scheduling
– configuration
– monitoring
• The stager is the main gateway to all resources managed by CASTOR
Vision: Storage Resource Sharing Facility
PROPOSAL
Proposal: Request handling & scheduling
[Diagram: request handling and scheduling flow]
A typical file request (e.g. Read: /castor/cern.ch/user/c/castor/TastyTrees, DN=castor) arrives at the request register, where a thread pool authenticates the user (“castor”) against a fabric authentication service (e.g. a Kerberos-V server) and stores the request in a request repository (Oracle or MySQL). The scheduler applies scheduling policies (e.g. user “castor” has priority), consulting disk server load and the catalogue (is the file already staged?), and hands jobs to a dispatcher, which runs each request on a disk server (e.g. pub003d).
Request registration must keep up with high request-rate peaks; request scheduling must keep up with average request rates.
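The split between fast registration and slower, policy-driven scheduling can be sketched as follows. This is purely illustrative — `RequestRepository`, the priority policy, and all names here are invented for the sketch, not CASTOR's actual interfaces:

```python
import queue
import threading

class RequestRepository:
    """Stand-in for the Oracle/MySQL request repository: registration
    only has to enqueue, so it can absorb high request-rate peaks."""
    def __init__(self):
        self._q = queue.Queue()

    def store(self, request):
        self._q.put(request)          # fast path: no scheduling work here

    def pop(self):
        return self._q.get()

def schedule(repo, policies):
    """Scheduler side: drain stored requests and rank them by policy
    (e.g. user "castor" has priority) before dispatching."""
    pending = []
    while not repo._q.empty():
        pending.append(repo.pop())
    pending.sort(key=lambda r: policies.get(r["user"], 0), reverse=True)
    return pending

repo = RequestRepository()
# Register requests concurrently, as the thread pool in the diagram would.
reqs = [{"user": "castor", "path": "/castor/cern.ch/user/c/castor/TastyTrees"},
        {"user": "alice", "path": "/castor/cern.ch/alice/run1.dat"}]
threads = [threading.Thread(target=repo.store, args=(r,)) for r in reqs]
for t in threads:
    t.start()
for t in threads:
    t.join()

order = schedule(repo, policies={"castor": 10})
print([r["user"] for r in order])   # "castor" is scheduled first
```

The point of the decoupling is that registration does only cheap, constant-time work per request, while the ranking and resource matching happen later at the scheduler's own pace.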
Proposal: Disk server access
• Today a user can access files on disk servers either by
– the CASTOR file name: /castor/cern.ch/...
– the physical file name: /shift/lhcb003d/...
• With the new stager we restrict this:
– access is allowed only by the CASTOR file name
– all physical files are owned by a generic account (stage,st) and their paths are hidden from direct RFIO access
WHY????
Proposal: Disk server access
• Avoid two databases for file permissions & ownership:
– the CASTOR name server
– the file system holding the physical file
• Facilitate migration/recall of user files:
– files with different owners are normally grouped together on tapes owned by a generic account (stage,st)
– we would like to avoid a setuid/setgid for every file
• Avoid backdoors: all disk server access must be scheduled
A useful analogy: forbid interactive login access to the batch nodes in an LSF cluster.
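The single-ownership-database idea can be sketched as follows. Everything here is hypothetical (CASTOR's name server interface looks nothing like this): logical ownership lives only in the catalogue, while every physical file carries the generic (stage,st) identity, so no per-file setuid/setgid is needed:

```python
# Hypothetical sketch: permissions are checked against the name server
# catalogue only; physical files all belong to the generic account.
GENERIC_OWNER = ("stage", "st")

name_server = {
    # CASTOR file name -> logical owner and physical path on a disk server
    "/castor/cern.ch/user/c/castor/TastyTrees":
        {"owner": "castor", "physical": "/shift/pub003d/data/000123"},
}

def may_read(castor_path, user):
    """Single permission check, done in the catalogue, not the filesystem."""
    entry = name_server.get(castor_path)
    return entry is not None and entry["owner"] == user

def physical_owner(castor_path):
    """Whatever the logical owner is, the disk file stays generic."""
    return GENERIC_OWNER

print(may_read("/castor/cern.ch/user/c/castor/TastyTrees", "castor"))  # True
print(physical_owner("/castor/cern.ch/user/c/castor/TastyTrees"))
```

With only one authority for permissions, migration and recall can move files between disk and tape under the generic identity without ever touching per-file Unix ownership.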
Project planning and monitoring
• Detailed plan in the proposal document
– http://cern.ch/castor/DOCUMENTATION/ARCHITECTURE/NEW
• Three milestones:
– October 2003: demonstrate the concept of a pluggable scheduler and high-rate request handling
– February 2004: integrated prototype of the whole system
– April 2004: production system ready for deployment
• Progress monitoring
– using the Project/Task manager provided by the LCG Savannah portal for the CASTOR project: http://savannah.cern.ch/projects/castor/
– progress reviews at each milestone? Are the experiments interested in providing effort to help with the reviews?
Progress up to now

mstaged/sstaged design
• Prototype with LSF scheduler ready
• Prototype with Maui almost ready
Security design
• Design paper ready; it defines strategies for securing CASTOR and emphasizes tight coupling with the site security infrastructure
• Well received by the CERN security team

Security implementation
• Generic Csecurity module on top of GSS-API; krb5 & GSI automatically supported
• Dynamic loading allows run-time setting or negotiation of the preferred authentication method
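The run-time negotiation of a preferred authentication method could look roughly like the following sketch. The plugin registry and mechanism handlers are invented for illustration; the real Csecurity module sits on top of GSS-API in C:

```python
# Illustrative plugin registry: mechanisms register themselves, and the
# client and server negotiate the first mutually supported one.
PLUGINS = {}

def register(name):
    def wrap(fn):
        PLUGINS[name] = fn
        return fn
    return wrap

@register("krb5")
def auth_krb5(user):
    return f"krb5 ticket for {user}"

@register("GSI")
def auth_gsi(user):
    return f"GSI proxy for {user}"

def negotiate(client_prefs):
    """Pick the first client-preferred mechanism the server also supports."""
    for mech in client_prefs:
        if mech in PLUGINS:
            return mech
    raise RuntimeError("no common authentication mechanism")

mech = negotiate(["GSI", "krb5"])
print(mech, "->", PLUGINS[mech]("castor"))
```

Loading mechanisms dynamically means a site can add or prefer an authentication method without relinking the clients, which is the point of the design.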
ReqHandler design, phase I
• Proof-of-concept request queuing
• Oracle & MySQL supported
• API for the mstaged/sstaged scheduler to pop requests off the queue
• No catalogue DB schema yet
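A database-backed request queue of this kind can be sketched with SQLite standing in for the Oracle/MySQL repository; the table layout and function names here are invented, not the actual ReqHandler schema:

```python
import sqlite3

# In-memory DB as a stand-in for the central request repository.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, user TEXT, path TEXT)")

def store_request(user, path):
    """Request-register side: just persist the request."""
    db.execute("INSERT INTO requests (user, path) VALUES (?, ?)", (user, path))

def pop_request():
    """What the mstaged/sstaged scheduler would call: take the oldest
    queued request and remove it from the repository."""
    row = db.execute(
        "SELECT id, user, path FROM requests ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    db.execute("DELETE FROM requests WHERE id = ?", (row[0],))
    return {"user": row[1], "path": row[2]}

store_request("castor", "/castor/cern.ch/user/c/castor/TastyTrees")
store_request("alice", "/castor/cern.ch/alice/run1.dat")
print(pop_request()["user"])   # oldest request first
```

Keeping the queue in a transactional database (rather than in memory) is what lets the stager survive restarts without losing queued requests.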
rtcpd modifications, design
• Design paper describing the modifications for dynamic extension of migration/recall streams

rtcpd modifications, implementation
• Implementation of the designed modifications ready
DMAPI evaluation
• Full DMAPI client prototype for XFS
• Evaluating the option of using DMAPI internally in CASTOR (between the stager and the tape archive)
Logging facility
• Implements the Universal Logger Message (ULM) format standard
• Multiple logging destinations supported:
– local logfiles
– central logging DB (Oracle or MySQL)
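ULM-style key=value log lines fanned out to several destinations can be sketched as below. The exact set of ULM fields CASTOR uses is assumed here, and a plain list stands in for the central logging DB:

```python
import logging
import socket
import time

class UlmFormatter(logging.Formatter):
    """Format records as ULM-like key=value lines (assumed field set)."""
    def format(self, record):
        return (f"DATE={time.strftime('%Y%m%d%H%M%S', time.gmtime(record.created))} "
                f"HOST={socket.gethostname()} PROG={record.name} "
                f"LVL={record.levelname} MSG=\"{record.getMessage()}\"")

central_db = []   # stand-in for the Oracle/MySQL logging destination

class DbHandler(logging.Handler):
    def emit(self, record):
        central_db.append(self.format(record))

log = logging.getLogger("stager")
log.setLevel(logging.INFO)
for h in (logging.StreamHandler(), DbHandler()):   # two destinations
    h.setFormatter(UlmFormatter())
    log.addHandler(h)

log.info("file staged")
print(central_db[0])
```

Because every destination shares one formatter, local files and the central DB stay searchable with the same key=value queries.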
Conclusions
• CASTOR@CERN status is OK.
• New CASTOR stager: the proposal aims for
– a pluggable framework for intelligent, policy-controlled file access scheduling
– an evolvable storage resource sharing facility framework rather than a total solution
– file access request running/control and local resource allocation delegated to the disk servers
• Progress is on track except for the design of the new catalogue; the milestones are still OK.