
Page 1: Storage at RAL Tier1A

Jeremy Coles (j.coles@rl.ac.uk), eScience Centre

Page 2: Outline

• Disk
  – Current status and plans
  – dCache

• Tape
  – Current status and history
  – SRM etc.
  – Plans

• Hardware

• Software

Page 3: Tier1A Disk

• 2002-03 (80TB)
  – Dual-processor servers
  – Dual-channel SCSI interconnect
  – External IDE/SCSI RAID arrays (Accusys and Infortrend)
  – ATA drives (mainly Maxtor)
  – Cheap and (fairly) cheerful

• 2004 (140TB)
  – Infortrend EonStor SATA/SCSI RAID arrays
  – 16 × 250GB Western Digital SATA drives per array
  – Two arrays per server

Page 4: Implementation

• Used by BaBar and other experiments as well as LHC

• 60 disk servers NFS-exporting their filesystems
  – Potential scaling problems if every CPU node wants to use the same disk

• Servers allocated to VOs so no contention or interference

• Need a means of pooling servers, so we looked at dCache

Page 5: Why we tried dCache

• Gives a virtual file space across many filesystems, optionally spread over several nodes

• Allows replication within file space to increase redundancy

• Allows a tape system to be interfaced at the back to further increase redundancy and storage available

• Data protocols are scalable; one GridFTP interface per server is easy and transparent

• At the time, the only SRM implementation available for disk pools

Page 6: dCache Doors

• Doors (interfaces) can be created into the system:
  – GridFTP
  – SRM
  – GSIDCAP (GFAL gives you a POSIX interface to this)

• All of these are GSI-enabled, but Kerberos doors also exist

• Everything remains consistent regardless of the door that is used

Page 7: History of dCache at RAL

• Mid 2003
  – We deployed a non-grid version for CMS. It was never used in production.

• End of 2003 / start of 2004
  – RAL offered to package a production-quality dCache.
  – Stalled due to bugs and holidays; went back to the developers and the LCG developers.

• September 2004
  – Redeployed dCache into the LCG system for the CMS and DTeam VOs.

• dCache deployed within the JRA1 testing infrastructure for gLite I/O daemon testing.

Page 8: dCache at RAL today

• Now deployed for ATLAS, CMS, DTeam and LHCb.

• 5 disk servers made up of 16 × 1.7TB partitions.

• CMS, the only serious users of dCache at RAL, have stored 2.5 TB in the system.

• They are accessing byte ranges via the GSIDCAP POSIX interface (see the sketch below).
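The byte-range access pattern above is what the GSIDCAP door plus GFAL's POSIX-style calls provide. What follows is a minimal sketch only, assuming the GFAL 1.x POSIX-like C API (gfal_open, gfal_lseek, gfal_read, gfal_close); the SURL, offsets and door details are illustrative and are not taken from the RAL configuration.

/*
 * Sketch: read a 4 KB byte range from a file held in dCache without
 * copying the whole file, using GFAL's POSIX-style calls.
 * Assumptions: GFAL 1.x is installed and a valid grid proxy exists;
 * the SURL below is a placeholder, not a real RAL path.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include "gfal_api.h"

int main(void)
{
    const char *surl =
        "srm://dcache.example.ac.uk:8443/pnfs/example.ac.uk/data/cms/hits.root";
    char buf[4096];
    int fd;
    ssize_t n;

    fd = gfal_open(surl, O_RDONLY, 0);        /* resolve the SURL via a door */
    if (fd < 0) { perror("gfal_open"); return EXIT_FAILURE; }

    if (gfal_lseek(fd, 1024 * 1024, SEEK_SET) < 0) {  /* jump to offset 1 MiB */
        perror("gfal_lseek"); gfal_close(fd); return EXIT_FAILURE;
    }
    n = gfal_read(fd, buf, sizeof(buf));      /* read only the range we need */
    if (n < 0) { perror("gfal_read"); gfal_close(fd); return EXIT_FAILURE; }

    printf("read %zd bytes from offset 1 MiB\n", n);
    gfal_close(fd);
    return EXIT_SUCCESS;
}

Something like "gcc sketch.c -lgfal" would build it on a node with the LCG GFAL library installed; the exact header location and link flags depend on the local installation.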

Page 9: Pool Group Per VO

• We found we could not apply quotas to file space between VOs.
  – Following advice, dCache was redeployed with a pool group per VO.
  – There is still only one SRM frontend. Data channels to it will be switched off, as we found that data transfers kill the head node.

• Now unable to publish space per VO…

Page 10: Current Deployment at RAL

Page 11: Transfer into dCache

Page 12: Transfer into dCache

Page 13: Extreme Deployment

Page 14: Current Installation Technique

• The Tier-1 now has its own cookbook to follow, but it is not generic at this time.

• Prerequisites
  – VDT for the certificate infrastructure
  – edg-mkgridmap for the grid-mapfile
  – J2RE
  – Host certificate for all nodes with a GSI door

Page 15: Unanswered Questions

• How do we drain a node for maintenance?
  – CHEP papers and statements from the developers say this is possible.

• How do we support small VOs?
  – 1.7TB is our standard partition size, and pools fill a whole partition.

Page 16: Other interfaces

• SRB
  – RAL supports (and develops) SRB for other communities
  – Ran MCAT for worldwide CMS simulation data
  – SRB is interfaced to the Atlas Datastore
  – Committed to supporting SRB

• Xrootd
  – xrootd interface to the BaBar data held on disk and in the ADS at RAL
  – ~15 TB of data, of which about 10TB is in the ADS; both will expand to about 70TB in the next few months
  – BaBar is planning to use xrootd access for background and conditions files for Monte Carlo production on LCG; basic tests have been run on WAN access to xrootd in Italy, and RAL will be involved in more soon

Page 17: Tape Overview

• General-purpose, multi-user data archive.

• In use for over 20 years, with four major upgrades.

• Current capacity 1PB – the largest (non-dedicated) multi-user system in UK academia?

History
• M860 – 110GB
• STK 4400 – 1.2TB
• IBM 3494 – 30TB
• STK 9310 – 1PB

Page 18: STK 9310

[Architecture diagram of the production and test systems, dated Thursday, 04 November 2004. Components shown: the STK 9310 with 8 x 9940 tape drives attached via two Brocade FC switches (ADS_switch_1 and ADS_Switch_2, four drives to each switch); AIX dataservers ermintrude, florence, zebedee and dougal; brian (AIX, flfsys) with catalogue, cache and disk arrays array1-array4; test nodes mchenry1 (AIX, test flfsys) and basil (AIX, test dataserver); Redhat hosts ADS0CNTR (counter), ADS0PT01 (pathtape) and ADS0SB01 (SRB interface); dylan (AIX, import/export); and buxton (SunOS, ACSLS). Users reach the system through pathtape commands, ADS sysreq admin commands (create/query), and SRB (Inq, S commands, MySRB). The legend distinguishes physical FC/SCSI connections, sysreq UDP commands, user SRB commands, VTP and SRB data transfers, STK ACSLS commands, and logging; all sysreq, VTP and ACSLS connections shown for dougal also apply to the other dataserver machines.]

Page 19: Atlas Datastore Architecture

[Architecture diagram, "Atlas Datastore Architecture", 28 Feb 03, B. Strong. Components shown: user programs on user nodes issuing flfsys user, farm, tape, admin and import/export commands over sysreq; the catalogue server brian running flfsys (+libflf) together with flfdoexp, flfdoback, the datastore script, recycling, flfqryoff (a copy of flfsys code), the backup catalogue and stats; farm servers with flfscan and flfaio moving data over libvtp; tape servers (flfstk, tapeserv, cellmgr) driving STK and IBM tape drives; the pathtape server rusty (servesys frontend/backend handling long- and short-name pathtape commands); the I/E server dylan (import/export); and the robot server buxton running ACSLS (SSI, CSI, LMU), which sends mount/dismount control information to the tape robot. Data is written as copies A, B and C, with cache disk and an SE interface also shown.]

Page 20: Hardware upgrade - completed Jun 2003

• STK 9310 “Powderhorn” with 6000 slots (1.2 PB)

• 4 IBM 3590B drives now phased out
  – 10 GB native capacity
  – 10 MB/s transfer

• 8 new STK 9940B drives
  – 200 GB native capacity
  – 30 MB/s transfer per drive
  – 240 MB/s theoretical maximum bandwidth (see the arithmetic below)

• 4 RS6000 data servers (+ 4 “others”)

• 1 Gbit networking (expected to become 10 Gbit by 2005)

• Data migration to new media completed ~Feb 2004
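Both headline figures on this slide follow directly from the per-unit numbers, as a quick check:

6000\ \text{slots} \times 200\,\mathrm{GB} = 1.2\,\mathrm{PB}
\qquad
8\ \text{drives} \times 30\,\mathrm{MB/s} = 240\,\mathrm{MB/s}\ \text{(theoretical maximum)}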

Page 21: Strategy

• De-couple the user and application from the storage media
• Upgrades and media migration occur “behind the scenes”
• High resilience – very few single points of failure
• High reliability, high availability (99.9986%)
• Constant environmental monitoring linked to alarm/call-out
• Easy to exploit (endless) new technology
• Lifetime data-integrity checks in hardware and software
• Fire safe and off-site backups; tested disaster-recovery procedures; media migration and recycling
• Technology watch to monitor the future technology path

Page 22: Supported interfaces

• We have successfully implemented a variety of layers on top of ADS to support standard interfaces

• FTP, OOFS, Globus IO, SRB, EDG SE, SRM, xrootd
  – so we can probably support others

Page 23: Overall Storage Goals – GridPP2

• Provide SRM interfaces to:
  – The Atlas Petabyte Storage facility at RAL
  – Disk (for Tier 1 and 2 in the UK)
  – Disk pools (for Tier 1 and 2 in the UK)

• Deploy and support interface to Atlas Datastore

• Package and support interfaces to disk

Page 24: Current status

• EDG-SE interface to ADS
  – Published as an SE in LCG
  – Supported by edg-rm

• SRM v1.1 interface to ADS
  – Tested with GFAL (earlier versions, <1.3.7)
  – Tested with srmcp (the dCache client)
  – Based on the EDG Storage Element
  – Also interfaces to disk

• Also working with the RAL Tier 1 on dCache
  – Install and support, including the SRM

Page 25: (Short Term) Timeline

• Provide a release of SRM to disk and disk array by end of January 2005

• Coincide with the EGEE gLite “release”

• Plan to match the path toward the full gLite release

Page 26: (Short Term) Strategy

• Currently considering both EDG SE and dCache

[Diagram: the two candidate stacks – the EDG Storage Element and dCache + dCache-SRM – mapped against the three targets: SRM to ADS, SRM to disk, and SRM to disk pool.]

Look at both to meet all goals:
• Some duplicated effort, but it helps mitigate risks
• Can fall back to only one (which may be dCache)
• In the long term, we will probably have a single solution

Page 27: Acceptance tests

• SRM tests – the SRM interface must work with:
  – srmcp (the dCache SRM client)
  – GFAL
  – gLite I/O

• Disk pool test – must work with:
  – dccp (dCache-specific)
  – plus the SRM interface on top (a sketch of such a test run follows below)
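As a rough illustration of what an acceptance run could look like, the sketch below shells out to srmcp and dccp and treats a zero exit status as a pass. It assumes both clients are on the PATH and that a valid grid proxy exists; the hostnames, ports and pnfs paths are purely illustrative, and the real tests would also exercise GFAL and gLite I/O.

/*
 * Acceptance-test sketch: write a file in through the SRM door with srmcp,
 * then read it back through the dcap door with dccp.
 * All endpoints and paths below are illustrative placeholders.
 */
#include <stdio.h>
#include <stdlib.h>

static int run(const char *label, const char *cmd)
{
    int rc = system(cmd);               /* zero exit status counts as a pass */
    printf("%-10s %s\n", label, rc == 0 ? "OK" : "FAILED");
    return rc == 0 ? 0 : 1;
}

int main(void)
{
    int failures = 0;

    /* SRM test: copy a local file into dCache via the SRM client */
    failures += run("srmcp",
        "srmcp file:////tmp/srmtest "
        "srm://dcache.example.ac.uk:8443/pnfs/example.ac.uk/data/dteam/srmtest");

    /* Disk pool test: read the same file back through the dcap door */
    failures += run("dccp",
        "dccp dcap://dcache.example.ac.uk:22125"
        "/pnfs/example.ac.uk/data/dteam/srmtest /tmp/srmtest.back");

    return failures ? EXIT_FAILURE : EXIT_SUCCESS;
}

A real harness would add the GFAL and gLite I/O cases and clean up the test files afterwards.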

Page 28: Questions

• What is the future of CERN’s DPM?
  – We want to test it

• Should we start implementing SRM 3?

• Will dCache ever go Open Source?

Page 29: Planned Tape Capacity

TB      2004   2005   2006   2007   2008
LHC       –     483   1150   1639   1573
Total     –     882   1500   2100   2100

Don’t believe the 2008 figures; storage will be reviewed in this timeframe.

Page 30: ADS Plans

• Planning a wider UK role in Data Curation and Storage (potentially 10-20PB by 2014)

• Review software layer – use of Castor possible

• Capacity plans based on adding STK Titanium 1 in 2005/06 and Titanium 2 in 2008/09

Page 31: Summary

• Working implementation of dCache for disk pools (main user is CMS)
  – Some outstanding questions
  – Plan to involve some Tier-2s shortly

• We will review other implementations as they become available

• RAL ADS supports SRB and xrootd for other communities.