AFS Update

O. Le Moigne IT/PDP/DM

Olivier.Le.Moigne@cern.ch

1/11/99

IT/PDP/DM Section

• Section leader : H.Renshall, O.Barring

• Tape infrastructure unit led by C.Curran

– F.Collin, R.Minchin

• SHIFT and CASTOR software unit led by J-P. Baud

– O.Barring, J-D.Durand

• Commercial products services (HPSS and AFS) unit led by H.Renshall

– B.Antoine, D.Asbury, O.Le Moigne

AFS dependencies

• server hardware/OS

• AFS server process

• Network

• AFS client software (cache manager)

• client hardware/OS

• application

AFS hardware

• 33 servers

– 9 IBM (AIX)

– 7 Sun (Solaris)

– 17 PCs (Solaris, Linux)

• Disk space: 2 TB available

– user: 600 GB used / 1100 GB allocated

– project: 800 GB used / 1400 GB allocated
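
As a rough worked check of the figures above, the usage ratios and the ratio of allocated space to available disk can be computed as sketched below. The conversion 1 TB = 1000 GB is an assumption (the slide does not say which convention is used), and the variable and function names are illustrative only.

```python
# Figures taken from the slide, in GB.
# Assumption: 1 TB = 1000 GB (not specified in the slides).
AVAILABLE_GB = 2 * 1000

spaces = {
    "user": {"used": 600, "alloc": 1100},
    "project": {"used": 800, "alloc": 1400},
}


def usage_ratio(used, alloc):
    """Fraction of the allocated space that is actually used."""
    return used / alloc


total_used = sum(s["used"] for s in spaces.values())
total_alloc = sum(s["alloc"] for s in spaces.values())

for name, s in spaces.items():
    print(f"{name}: {usage_ratio(s['used'], s['alloc']):.0%} of allocation used")

print(f"total used: {total_used} GB of {AVAILABLE_GB} GB available")
print(f"allocated / available: {total_alloc / AVAILABLE_GB:.2f}")
```

Under that assumption, the total allocation (2500 GB) exceeds the available disk while actual use (1400 GB) stays below it, which only works as long as users do not fill their allocations at the same time.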

Disk policy

• Home directories (managed by xspaceadm)

– limited to 300 MB/user

– mirror or RAID 5

• Project space (managed by afs_admin)

– several 100 GB/experiment

– NO MIRROR or RAID 5

– backed up: if one disk fails, the previous day's data will be recovered from tape

– or not backed up (scratch volumes: q.*)
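
For illustration, the 300 MB home-directory limit corresponds to an AFS volume quota, which the standard `fs setquota` and `fs listquota` commands set and display. The sketch below only builds the command lines and does not run them. Whether xspaceadm issues these commands is not stated in the slides, the home-directory path is hypothetical, and 1 MB = 1024 KB is an assumption.

```python
HOME_QUOTA_MB = 300  # per-user limit stated on the slide


def setquota_command(path, quota_mb=HOME_QUOTA_MB):
    """Build the argument list for `fs setquota`; -max is given in kilobytes."""
    quota_kb = quota_mb * 1024  # assumption: 1 MB = 1024 KB
    return ["fs", "setquota", "-path", path, "-max", str(quota_kb)]


def listquota_command(path):
    """Build the argument list for `fs listquota` on the same directory."""
    return ["fs", "listquota", "-path", path]


# Hypothetical home directory, used only to show the resulting command lines.
home = "/afs/example.org/user/j/jdoe"
print(" ".join(setquota_command(home)))
print(" ".join(listquota_command(home)))
```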

AFS software

• Current version : AFS 3.4a (UNIXes, NT)

• AFS 3.5

– mainly for performance improvement

– deployed only for Linux

• AFS 3.6: beta under test

– volumes > 2 GB, 64-bit Solaris

– will be deployed next year

Cache Manager problems

• 2 problems:

– Cache consistency: a modification is not visible everywhere

– Cache corruption: a file is corrupted

• Solution

– fs flush/flushv discouraged

– problems should be reported to afs.support@cern.ch

– new patch to be installed soon on HP nodes

Other problems

• Performance

– Overloaded servers (e.g. ASIS volumes moved)

– Backups

• Connectivity problems

– especially with FDDI

• Server crashes (need for supported config)

• Disk failures (need for redundancy)

Replication

• Volumes can be replicated read-only on several servers

– load balancing

– no trouble for clients if one server crashes

– good for everything that is accessed a lot and not modified often:

• binaries

• documentation

– send us your proposals
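
As a minimal sketch of what read-only replication involves on the administration side, the function below builds the standard `vos addsite` and `vos release` command lines for a volume. It does not execute anything, and the volume, server and partition names in the usage example are hypothetical, not taken from the CERN cell.

```python
def replication_commands(volume, sites):
    """Return the `vos` commands that define read-only sites for a volume
    and then push the current read/write contents to those sites."""
    commands = [
        ["vos", "addsite", "-server", server, "-partition", partition, "-id", volume]
        for server, partition in sites
    ]
    # One release updates every read-only site defined for the volume.
    commands.append(["vos", "release", "-id", volume])
    return commands


# Hypothetical volume and file servers, for illustration only.
for cmd in replication_commands(
    "p.docs",
    [("afs1.example.org", "/vicepa"), ("afs2.example.org", "/vicepb")],
):
    print(" ".join(cmd))
```

Changes to the read/write volume reach the replicas only at the next release, which is why replication suits content that is read often and modified rarely.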

Linux

• Version certified by IT/DIS: Redhat 5.1 using old free client from MIT (AFS 3.3)

– stability problems (especially on SMP)

– lots of cache consistency/corruption problems

– completely unsupported (no problem investigation)

• Supported version : Redhat 6.0, AFS 3.5 (Redhat 6.1 soon)

Plans

• Update hardware: homogeneity/stability

– SUN Solaris

• user: mirror or RAID 5

• project: all in mirror or RAID 5?

– PC Linux: replica + scratch

– Ethernet (100 Mb/s or 1 Gb/s)

• Update software

– AFS 3.6 next year

• rethinking user workspace

Contact

• User problem

– Atlas group administrator (quota)

– Helpdesk@cern.ch and Atlas support list

• Administration problem

– Afs.Support@cern.ch and Atlas support list

• Info

– http://consult.cern.ch/services/afs
