INFN-T1 site report
Andrea Chierici
On behalf of INFN-T1 staff
28th October 2009
Overview
Infrastructure
Network
Farming
Storage
Infrastructure
                     INFN-T1 2005                        INFN-T1 2009
Racks                40                                  120
Power source         University                          Directly from supplier (15 kV)
Power transformer    1 (~1 MVA)                          3 (~2.5 MVA)
UPS                  1 diesel engine/UPS (~640 kVA)      2 rotary UPS (~3400 kVA) + 1 diesel engine (~640 kVA)
Chiller              1 (~530 kVA)                        7 (~2740 kVA)
[Diagram: electrical and cooling layout: 15,000 V feed from the supplier, UPS capacity up to 3.8 MW, chiller plant, section loads in the 1 to 1.4 MW range, mechanical and electrical surveillance]
Network
INFN CNAF TIER1 Network
[Diagram: INFN CNAF Tier1 network layout]
WAN access via GARR on a Cisco 7600 (2x10 Gb/s) and a Cisco NEXUS 7000
LHC-OPN dedicated 10 Gb/s link: T0-T1 (CERN) and T1-T1 (PIC, RAL, TRIUMF)
General IP 10 Gb/s link: T1-T1s (BNL, FNAL, TW-ASGC, NDGF), T1-T2s and CNAF general-purpose traffic
LHC-OPN T0-T1 backup (10 Gb/s) via the CNAF-KIT, CNAF-IN2P3 and CNAF-SARA links
Core switches: Extreme BD10808 and Extreme BD8810, interconnected at 4x10 Gb/s
Worker nodes attached to Extreme Summit450 and Summit400 rack switches (2x1 Gb/s links), with 4x1 Gb/s or 2x10 Gb/s uplinks towards the core
Storage servers (disk servers, CASTOR stagers) connected to the storage devices through a Fibre Channel SAN and an FC director
In case of network congestion: uplink upgrade from 4x1 Gb/s to 10 Gb/s or 2x10 Gb/s
Farming
New tender
1U twin solution with these specs:
2x Intel Nehalem E5520 @ 2.26 GHz
24 GB RAM
2x 320 GB SATA HD @ 7200 rpm
2x 1 Gbps Ethernet
118 twins, reaching 20500 HEP-SPEC, measured on SLC4 (per-node arithmetic sketched below)
Delivery and installation foreseen within 2009
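As a rough cross-check of the tender figures, a minimal sketch; the two-nodes-per-twin and eight-cores-per-node assumptions are ours, not from the slide:

```python
# Rough per-node/per-core arithmetic for the new tender (illustrative only).
TWINS = 118
NODES = TWINS * 2              # assumption: two independent nodes per 1U twin chassis
CORES_PER_NODE = 8             # assumption: dual quad-core Xeon E5520
TOTAL_HEP_SPEC = 20500         # measured figure quoted above

print(f"HEP-SPEC per node: {TOTAL_HEP_SPEC / NODES:.1f}")                     # ~86.9
print(f"HEP-SPEC per core: {TOTAL_HEP_SPEC / (NODES * CORES_PER_NODE):.1f}")  # ~10.9
```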
Computing resources
Including machines from the new tender, INFN-T1 computing power will reach 42000 HEP-SPEC within 2009
A further increase within January 2010 will bring us to 46000 HEP-SPEC
Within May 2010 we will reach 68000 HEP-SPEC (as we pledged to WLCG); this will basically triple the current computing power
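A quick sanity check of the growth figures, assuming the "current" capacity is the 2009 total minus the new tender (an inference, not a number from the slide):

```python
# Growth of INFN-T1 computing power in HEP-SPEC06, as quoted above.
new_tender = 20500
total_2009 = 42000
current = total_2009 - new_tender      # inferred current capacity, ~21500
jan_2010 = 46000
may_2010 = 68000                       # WLCG pledge

for label, value in [("current (inferred)", current), ("end 2009", total_2009),
                     ("Jan 2010", jan_2010), ("May 2010", may_2010)]:
    print(f"{label}: {value} HEP-SPEC ({value / current:.1f}x current)")
# May 2010 comes out at ~3.2x the inferred current capacity, i.e. roughly triple.
```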
Resource usage per VO
KSI2K pledged vs used
New accounting system
Grid, local and overall job visualization
Tier1/Tier2 separation
Several parameters monitored: avg and max RSS, avg and max Vmem added in latest release
KSI2K/HEP-SPEC accounting (conversion sketched below)
WNoD accounting
Available at: http://tier1.cnaf.infn.it/monitor
Feedback welcome to: [email protected]
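For the KSI2K/HEP-SPEC accounting, a minimal conversion sketch; the factor of 4 HEP-SPEC06 per kSI2K is the commonly quoted WLCG conversion and should be replaced by a locally calibrated value if different:

```python
# kSI2K <-> HEP-SPEC06 conversion as used in WLCG accounting.
HEPSPEC_PER_KSI2K = 4.0   # commonly quoted WLCG factor (assumption for this sketch)

def ksi2k_to_hepspec(ksi2k: float) -> float:
    return ksi2k * HEPSPEC_PER_KSI2K

def hepspec_to_ksi2k(hepspec: float) -> float:
    return hepspec / HEPSPEC_PER_KSI2K

# Example: the 42000 HEP-SPEC expected for end of 2009 corresponds to ~10500 kSI2K.
print(hepspec_to_ksi2k(42000))
```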
New accounting: sample picture
GPU Computing (1)
We are investigating GPU computing
NVIDIA Tesla C1060, used for porting software and performing comparison tests
Meeting with Bill Dally (chief scientist and vice president of NVIDIA): https://agenda.cnaf.infn.it/conferenceDisplay.py?confId=266
GPU Computing (2)
Applications currently tested:
Bioinformatics: CUDA-based paralog filtering in Expressed Sequence Tag clusters
Physics: implementation of a second order electromagnetic particle-in-cell code on the CUDA architecture
Physics: Spin-Glass Monte Carlo simulations
The first two applications showed a more than 10x increase in performance!
GPU Computing (3)
We plan to buy 2 more workstations in 2010, with 2 GPUs each; we are waiting for the Fermi architecture, foreseen for spring 2010
We will continue the activities currently ongoing and will probably test some Monte Carlo simulations for SuperB
We plan to test selection and shared usage of GPUs via grid
Storage
2009-2010 tenders
Disk tender requested
Baseline: 3.3 PB raw (~2.7 PB-N)
1st option: 2.35 PB raw (~1.9 PB-N)
2nd option: 2 PB raw (~1.6 PB-N)
Options to be requested during Q2 and Q3 2010
New disk in production ~end of Q1 2010
4000 tapes (~4 PB) acquired with the library tender
4.9 PB needed at the beginning of 2010; 7.7 PB probably needed by mid-2010
Castor@INFN-T1
To be upgraded to 2.1.7-27
1 SRM v2.2 end-point available
Supported protocols: rfio, gridftp
Still cumbersome to manage: requires frequent interventions in the Oracle DB, lack of management tools
CMS migrated to StoRM for D0T1
WLCG Storage Classes at INFN-T1 today
Storage Class: offers different levels of storage quality (e.g. copy on disk and/or on tape)
DnTm = n copies on disk and m copies on tape (a toy model is sketched below)
Implementation of 3 Storage Classes needed for WLCG (but usable also by non-LHC experiments):
Disk0-Tape1 (D0T1), "custodial nearline": data migrated to tape and deleted from disk when the staging area is full; space managed by the system, disk is only a temporary buffer
Disk1-Tape0 (D1T0), "replica online": data kept on disk, no tape copy; space managed by the VO
Disk1-Tape1 (D1T1), "custodial online": data kept on disk AND one copy kept on tape; space managed by the VO (i.e. if disk is full, the copy fails)
Current implementations: CASTOR and GPFS/TSM + StoRM
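A toy model of the three storage classes described above; this is purely illustrative, and the class and field names are ours, not StoRM or CASTOR terminology:

```python
# Toy representation of the WLCG storage classes (DnTm) listed above.
from dataclasses import dataclass

@dataclass
class StorageClass:
    name: str
    disk_copies: int        # "n" in DnTm
    tape_copies: int        # "m" in DnTm
    space_managed_by: str   # "system" or "VO"

STORAGE_CLASSES = {
    # custodial nearline: disk is only a temporary buffer in front of tape
    "D0T1": StorageClass("custodial nearline", 0, 1, "system"),
    # replica online: data live on disk only, no tape copy
    "D1T0": StorageClass("replica online", 1, 0, "VO"),
    # custodial online: data on disk AND one copy on tape; if disk is full, the copy fails
    "D1T1": StorageClass("custodial online", 1, 1, "VO"),
}

for token, sc in STORAGE_CLASSES.items():
    print(f"{token}: {sc.name}, disk copies={sc.disk_copies}, "
          f"tape copies={sc.tape_copies}, space managed by {sc.space_managed_by}")
```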
YAMSS: present status
Yet Another Mass Storage System
Scripting and configuration layer to interface GPFS & TSM
Can work driven by StoRM or stand-alone: experiments not using the SRM model can work with it
GPFS-TSM (no StoRM) interface ready, with full support for migrations and tape-ordered recalls (see the sketch below)
StoRM in production at INFN-T1 and in other centres around the world for "pure" disk access (i.e. no tape)
Integration with YAMSS for migrations and tape-ordered recalls ongoing (almost completed)
Bulk migrations and recalls tested with a typical use case (stand-alone YAMSS, without StoRM): the weekly production workflow of the CMS experiment
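A minimal sketch of what "tape-ordered recalls" means: recall requests are grouped by tape and sorted by position on tape, so each cartridge is mounted once and read sequentially. This is illustrative Python, not the YAMSS interface; the metadata lookup and the recall call-out are hypothetical placeholders:

```python
# Illustrative tape-ordered recall: group requested files by the tape they
# live on and read each tape in on-tape order, minimising mounts and seeks.
from collections import defaultdict

def lookup_tape_location(path):
    """Hypothetical: return (tape_label, position_on_tape) for a migrated file."""
    raise NotImplementedError

def recall_from_tape(tape, ordered_paths):
    """Hypothetical: mount `tape` once and stage `ordered_paths` back to disk."""
    raise NotImplementedError

def tape_ordered_recall(requested_paths):
    by_tape = defaultdict(list)            # tape label -> [(position, path), ...]
    for path in requested_paths:
        tape, position = lookup_tape_location(path)
        by_tape[tape].append((position, path))

    for tape, entries in by_tape.items():
        entries.sort()                     # read in on-tape order
        recall_from_tape(tape, [path for _, path in entries])
```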
Why GPFS & TSM
Tivoli Storage Manager (developed by IBM) is a tape-oriented storage manager, widely used (also in the HEP world, e.g. FZK)
Built-in functionality is present in both products to implement backup and archiving from GPFS
The development of an HSM solution is based on the combination of features of GPFS (since v3.2) and TSM (since v5.5)
Since GPFS v3.2 the new concept of "external storage pool" extends the use of policy-driven Information Lifecycle Management (ILM) to tape storage (sketched below)
External pools are real interfaces to external storage managers, e.g. HPSS or TSM; HPSS is very complex (no benefits in this sense compared to CASTOR)
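A minimal sketch of the policy-driven migration idea, not GPFS's actual policy language nor the TSM API: when the disk pool fills beyond a high-water mark, the least recently used files are handed to the external (tape) pool until occupancy drops below a low-water mark.

```python
# Illustrative threshold-driven migration in the spirit of a GPFS ILM rule
# ("migrate from the disk pool to the external tape pool when 90% full, down
# to 70%"). The filesystem scan and the tape call-out are placeholders.
import os

HIGH_WATER = 0.90   # start migrating above this occupancy
LOW_WATER = 0.70    # stop migrating below this occupancy

def pool_occupancy(mount_point: str) -> float:
    st = os.statvfs(mount_point)
    return 1.0 - st.f_bavail / st.f_blocks

def candidate_files(mount_point: str):
    """All regular files, least recently accessed first (LRU-like ordering)."""
    files = []
    for root, _, names in os.walk(mount_point):
        for name in names:
            path = os.path.join(root, name)
            files.append((os.stat(path).st_atime, path))
    return sorted(files)

def migrate_to_tape(path: str) -> None:
    """Hypothetical call-out to the external storage manager (e.g. TSM)."""
    raise NotImplementedError

def run_policy(mount_point: str) -> None:
    if pool_occupancy(mount_point) < HIGH_WATER:
        return
    for _, path in candidate_files(mount_point):
        migrate_to_tape(path)
        if pool_occupancy(mount_point) < LOW_WATER:
            break
```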
YAMSS: hardware set-up
[Diagram: YAMSS hardware set-up]
~500 TB for GPFS on a CX4-960, attached to the SAN at 20x4 Gbps
4 GridFTP servers (4x2 Gbps) and 6 NSD servers (6x2 Gbps) on the LAN
3 HSM/STA nodes, with 8x4 Gbps and 3x4 Gbps links towards the SAN and TAN
8 T10KB tape drives on the TAN: 1 TB per tape, ~1 Gbps per drive
TSM server (with its db) on 4 Gbps FC
YAMSS: validation tests
Concurrent access in read/write to the MSS, for transfers and from the farm (StoRM not used in these tests)
3 HSM nodes serving 8 T10KB drives: 6 drives (at maximum) used for recalls, 2 drives (at maximum) used for migrations
Of the order of 1 GB/s of aggregated traffic:
• ~550 MB/s from tape to disk
• ~100 MB/s from disk to tape
• ~400 MB/s from disk to the computing nodes (not shown in this graph)
Questions?