
Tier 2 Prague, Institute of Physics AS CR
Status and Outlook

J. Chudoba, M. Elias, L. Fiala, J. Horky, T. Kouba, J. Kundrat, M. Lokajicek, J. Svec, P. Tylka

NEC2013, Varna, 13 September 2013

Outline

• Institute of Physics AS CR (FZU)
• Computing Cluster
• Networking
• LHCONE
• Looking for new resources
  – CESNET National Storage Facility
  – IT4I supercomputing project
• Outlook

Institute of Physics AS CR (FZU)

• Institute of Physics of the Academy of Sciences of the Czech Republic
• 2 locations in Prague, 1 in Olomouc
  – In 2012: 786 employees (281 researchers + 78 doctoral students)
  – 6 Divisions
    • Division of Elementary Particle Physics
    • Division of Condensed Matter Physics
    • Division of Solid State Physics
    • Division of Optics
    • Division of High Power Systems
    • ELI Beamlines Project Division
• Department of Networking and Computing Techniques (SAVT)

FZU – SAVT

• Institute's networking and computing service department
  – Several server rooms
  – Computing clusters
• Golias – particle physics, Tier2
  – A few nodes dating from before EDG
  – WLCG interim MoU (iMoU) signed 4 July 2003
  – New server room since 1 November 2004
  – WLCG MoU from 28 April 2008
• Dorje – solid state, condensed matter
• Luna, Thsun – smaller group clusters

Main server room

• Main server room (at FZU, Na Slovance)
  – 62 m2, ~20 racks, 350 kVA motor generator, 200 + 2 x 100 kVA UPS, 108 kW air cooling, 176 kW water cooling
  – Continuous changes
  – Hosts computing servers and central services


Cluster Golias

• Upgraded every year
  – Nine sub-clusters of identical HW
  – 3 800 cores, 30 700 HS06
  – 2 PB disk space
• Tapes used only for local backups (125 LTO4 cassettes, max 500)
• Serving: ATLAS, ALICE, D0 (NOvA), Auger, STAR, …
• WLCG Tier2 = Golias@FZU + xrootd servers@REZ (NPI)

Utilization

• Very high average utilization
  – Several different projects, different tools for production
  – D0 – production submitted locally by 1 user
  – ATLAS – PanDA, Ganga, local users; DPM
  – ALICE – VO box; xrootd

[Chart: farm usage by experiment (D0, ATLAS, ALICE), scale up to 3.5 k]

„RAW“ Capacities

Year      HEPSPEC2006   %     TB disk            %
2009      10 340        -     186                -
2010      19 064        100   427                100
2011      23 484        100   1 714              100
2012      29 660        100   2 521              100
  D0       9 993        34    35                 1
  ATLAS   12 127        41    1 880 (+16 MFF)    74
  ALICE    7 540        25    606 (+100 Řež)     24
2013      29 660        100   2 521              100
  D0       9 993        34    35                 1
  ATLAS   12 127        41    1 880 (+16 MFF)    74
  ALICE    7 540        25    606 (+140 Řež)     24

2012 D0, ATLAS and ALICE usage

• ATLAS
  – 2.2 M tasks
  – 90 M HEPSPEC06 hours, 1.9 PB disk space
  – Data transfer: 1.2 PB to farm, 0.9 PB from farm
  – 2% contribution to ATLAS
• ALICE
  – 2 M simulation tasks
  – 60 M HEPSPEC06 hours
  – Data transfer: 4.7 PB to farm, 0.5 PB from farm
  – 5% contribution to ALICE task processing
  – 140 TB disk space in INF (Tier3)
• D0
  – 290 M tasks
  – 90 M HEPSPEC06 hours
  – 13% contribution to D0

[Chart: 2012 data transfers inside the farm – monthly means to and from worker nodes in TB (incoming/outgoing), 2012-01 to 2012-11, scale 0-3000 TB]

Network – CESNET, z. s. p. o.

• FZU Tier2 network connections
  – 10 Gbps LHCONE (GEANT), since 18 July 2013
  – 10 Gbps to KIT, since 1 September 2013
  – 1 Gbps to FNAL, BNL, Taipei
  – 10 Gbps to commodity network
  – 1-10 Gbps to Tier3 collaborating institutes

http://netreport.cesnet.cz/netreport/hep-cesnet-experimental-facility2/

LHCONE – network transition

• Link to KIT was saturated at the 1 Gbps E2E line
• LHCONE from 18 July 2013 over 10 Gbps infrastructure
• Also relieves the commodity network

ATLAS tests

• Testing upload speed of files > 1 GB to all Tier1 centres
• After the LHCONE connection, only 2 sites remain below 5 MB/s
• Prague Tier2 ready for validation as T2D

LHCONE – trying to understand monitoring

• Prague → DESY: very asymmetric throughput; LHCONE line cut
• DESY → Prague: LHCONE optical line cut at 4:00; one-way latency improved

International contribution of the Prague centre to ATLAS + ALICE LCG T2 centres

• http://accounting.egi.eu/
• Grid + local tasks
• Long-term decline until we received regular financing in 2008
• The original 3% target is not achievable with current financial resources
• Necessary to look for other resources

[Chart: Prague contribution 2005-2012, jobs % and CPU %, scale 0-6]

Remote storage

• CESNET – Czech NREN + other services
• New project: National storage facility
  – Three distributed HSM-based storage sites
  – Designed for the research and science community
  – 100 TB offered for both the ATLAS and Auger experiments
  – Implemented as a remote Storage Element with dCache
  – disk <-> tape migration

[Diagram: FZU Tier-2 in Prague <-> CESNET storage site in Pilsen, ~100 km; 10 Gbit link with ~3.5 ms latency]
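A quick sanity check (not from the slides) of why such a link can feed a remote Storage Element efficiently is the bandwidth-delay product, taking the quoted ~3.5 ms as the round-trip time:

\[
\text{BDP} = 10\,\text{Gbit/s} \times 3.5\,\text{ms} = 3.5 \times 10^{7}\,\text{bit} \approx 4.4\,\text{MB}
\]

A single TCP stream thus needs only a ~4.4 MB window to keep the Prague-Pilsen link full, well within modern TCP autotuning limits, so throughput is limited by the storage endpoints rather than by the distance.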

Remote storage

Influence of distributing Tier-2 data storage on physics analysis:

remote/local   Method   TTreeCache   events/s (%)   Bytes transferred (%)   CPU efficiency (%)
local          rfio     ON           100            117                     98.9
local          rfio     OFF          74             100                     72.7
remote         dCap     ON           75             101                     73.5
remote         dCap     OFF          46             100                     46.9

• TTreeCache in ROOT helps a lot – both for local and for remote transfers
• Remote jobs with TTreeCache are faster than local jobs without the cache
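To make the TTreeCache effect concrete, here is a minimal ROOT/C++ sketch of how such a read test can be set up. The dcap:// URL, the tree name "events", the cache size, and the function name are illustrative assumptions, not values from the measurement above.

```cpp
// Minimal sketch: reading a TTree over dCap with TTreeCache enabled.
// The dcap:// URL and the tree name "events" are hypothetical.
#include "TFile.h"
#include "TTree.h"

void read_with_cache() {
    // TFile::Open dispatches to the dCap plugin for dcap:// URLs;
    // a plain local path would exercise the "local" case instead.
    TFile *f = TFile::Open("dcap://dcache.example.cz/pnfs/example/data.root");
    if (!f || f->IsZombie()) return;

    TTree *t = nullptr;
    f->GetObject("events", t);
    if (!t) { f->Close(); return; }

    t->SetCacheSize(30 * 1024 * 1024);  // enable a 30 MB TTreeCache
    t->AddBranchToCache("*", true);     // prefetch all branches in large blocks

    for (Long64_t i = 0; i < t->GetEntries(); ++i)
        t->GetEntry(i);  // entries now come from the cache, not per-branch reads

    f->Close();
}
```

Without the cache, each branch issues its own small remote reads, which is plausibly what drives the dCap/OFF row down to ~46% of the local baseline; the cache coalesces them into a few large requests, hiding most of the ~3.5 ms round trips.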

Outlook

• In 2015, after LHC start-up:
  – Higher data production
  – Flat financing not sufficient
  – Computing may become an item of M&O A (Maintenance & Operations, category A)
• Search for new financial resources or new unpaid capacities is necessary
  – CESNET
    • Crucial: free delivery of network infrastructure
    • Unpaid external storage – for how long?
  – IT4I, Czech supercomputing project
    • Search for computing capacities (free cycles); relying on other projects to find a way to use them

ACAT 2014

• 16th International Workshop on Advanced Computing and Analysis Techniques in Physics (ACAT)
• http://www.particle.cz/acat2014
• Topics:
  – Computing Technology for Physics Research
  – Data Analysis – Algorithms and Tools
  – Computations in Theoretical Physics: Techniques and Methods

Backup