
Page 1: Tier 2 Prague Institute of Physics AS CR

Tier 2 Prague, Institute of Physics AS CR

Status and Outlook

J. Chudoba, M. Elias, L. Fiala, J. Horky, T. Kouba, J. Kundrat, M. Lokajicek, J. Svec, P. Tylka

13 September 2013, NEC2013, Varna

Page 2: Tier 2 Prague Institute of Physics AS CR

Outline

• Institute of Physics AS CR (FZU)
• Computing Cluster
• Networking
• LHCONE
• Looking for new resources
  – CESNET National Storage Facility
  – IT4I supercomputing project
• Outlook

Page 3: Tier 2 Prague Institute of Physics AS CR

Institute of Physics AS CR (FZU)

• Institute of Physics of the Academy of Sciences of the Czech Republic
• 2 locations in Prague, 1 in Olomouc
  – In 2012: 786 employees (281 researchers + 78 doctoral students)
  – 6 Divisions
    • Division of Elementary Particle Physics
    • Division of Condensed Matter Physics
    • Division of Solid State Physics
    • Division of Optics
    • Division of High Power Systems
    • ELI Beamlines Project Division
• Department of Networking and Computing Techniques (SAVT)

Page 4: Tier 2 Prague Institute of Physics AS CR

FZU – SAVT

• The Institute's networking and computing service department
  – Several server rooms
  – Computing clusters
    – Golias – particle physics, Tier2
      • A few nodes already before EDG
      • WLCG interim MoU (iMoU) signed 4 July 2003
      • New server room 1 November 2004
      • WLCG MoU from 28 April 2008
    – Dorje – solid state, condensed matter
    – Luna, Thsun, smaller group clusters

Page 5: Tier 2 Prague Institute of Physics AS CR

Main server room

• Main server room (at FZU, Na Slovance)
  – 62 m², ~20 racks, 350 kVA motor generator, 200 + 2 × 100 kVA UPS, 108 kW air cooling, 176 kW water cooling
  – Continuous changes
  – Hosts computing servers and central services


Page 7: Tier 2 Prague Institute of Physics AS CR

Cluster Golias

• Upgraded every year
  – Several (9) sub-clusters of identical HW
• 3 800 cores, 30 700 HS06
• 2 PB disk space
• Tapes used only for local backups (125 LTO4, max 500 cassettes)
• Serving: ATLAS, ALICE, D0 (NOvA), Auger, STAR, …
• WLCG Tier2: Golias@FZU + xrootd servers@REZ (NPI)

Page 8: Tier 2 Prague Institute of Physics AS CR

Utilization

• Very high average utilization
  – Several different projects, different tools for production
  – D0 – production submitted locally by 1 user
  – ATLAS – PanDA, Ganga, local users; DPM
  – ALICE – VO box; xrootd

[Plot: farm occupancy by experiment (D0, ATLAS, ALICE), up to ~3.5 k jobs]

Page 9: Tier 2 Prague Institute of Physics AS CR

"RAW" capacities

Year / Experiment   HEPSPEC2006    %     TB disk             %
2009                10 340               186
2010                19 064         100   427                 100
2011                23 484         100   1 714               100
2012                29 660         100   2 521               100
  D0                 9 993          34   35                  1
  ATLAS             12 127          41   1 880 (+16 MFF)     74
  ALICE              7 540          25   606 (+100 Řež)      24
2013                29 660         100   2 521               100
  D0                 9 993          34   35                  1
  ATLAS             12 127          41   1 880 (+16 MFF)     74
  ALICE              7 540          25   606 (+140 Řež)      24

Page 10: Tier 2 Prague Institute of Physics AS CR

2012 D0, ATLAS and ALICE usage

• D0
  – 290 M tasks
  – 90 MHEPSPEC06 hours
  – 13% contribution to D0
• ATLAS
  – 2.2 M tasks
  – 90 MHEPSPEC06 hours, 1.9 PB disk space
  – Data transfer: 1.2 PB to the farm, 0.9 PB from the farm
  – 2% contribution to ATLAS
• ALICE
  – 2 M simulation tasks
  – 60 MHEPSPEC06 hours
  – Data transfer: 4.7 PB to the farm, 0.5 PB from the farm
  – 5% contribution to ALICE task processing
  – 140 TB disk space at NPI Řež (Tier3)

[Chart: 2012 data transfers inside the farm – monthly means to and from the worker nodes in TB, incoming and outgoing, January–November 2012]


Page 12: Tier 2 Prague Institute of Physics AS CR

Network – CESNET, z. s. p. o.

• FZU Tier2 network connections
  – 10 Gbps LHCONE (GEANT), from 18 July 2013
  – 10 Gbps to KIT from 1 September 2013
  – 1 Gbps to FNAL, BNL, Taipei
  – 10 Gbps to the commodity network
  – 1–10 Gbps to Tier3 collaborating institutes

http://netreport.cesnet.cz/netreport/hep-cesnet-experimental-facility2/

Page 13: Tier 2 Prague Institute of Physics AS CR

LHCONE – Network transition

• The link to KIT was saturated at 1 Gbps (end-to-end line)
• LHCONE connected from 18 July 2013 over 10 Gbps infrastructure
• Also relieves the commodity network

Page 14: Tier 2 Prague Institute of Physics AS CR

ATLAS tests

• Testing upload speed of files > 1 GB to all Tier1 centres
• After the LHCONE connection, only 2 sites with < 5 MB/s
• Prague Tier2 ready for validation as T2D

Page 15: Tier 2 Prague Institute of Physics AS CR

LHCONE – trying to understand monitoring

[Monitoring plots, Prague–DESY:]
• Prague – DESY: very asymmetric throughput
• LHCONE line cut
• DESY – Prague: LHCONE optical line cut at 4:00; one-way latency improved

Page 16: Tier 2 Prague Institute of Physics AS CR

International contribution of the Prague centre to ATLAS + ALICE T2 LCG centres

• http://accounting.egi.eu/
• Grid + local tasks
• Long-term decline until we received regular financing in 2008
• The original 3% target is not achievable with the current financial resources
• Necessary to look for other resources

[Chart: Prague contribution by year, 2005–2012 – jobs % and CPU %]

Page 17: Tier 2 Prague Institute of Physics AS CR

Remote storage

• CESNET – the Czech NREN + other services
• New project: national storage facility
  – Three distributed HSM-based storage sites
  – Designed for the research and science community
  – 100 TB offered for both the ATLAS and Auger experiments
  – Implemented as a remote Storage Element with dCache
  – Disk <-> tape migration
• FZU Tier-2 in Prague <-> CESNET storage site in Pilsen: ~100 km, 10 Gbit link with ~3.5 ms latency
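As a rough back-of-the-envelope addition (not from the slides; it assumes the quoted ~3.5 ms is the round-trip time), the bandwidth-delay product of the Prague–Pilsen link indicates how much data must be kept in flight to fill it:

\[
\mathrm{BDP} = 10\ \mathrm{Gbit/s} \times 3.5\ \mathrm{ms} = 3.5 \times 10^{7}\ \mathrm{bit} \approx 4.4\ \mathrm{MB}
\]

A client issuing many small, synchronous reads therefore leaves most of the 10 Gbit/s unused, which is one reason the read-ahead caching discussed on the next slide matters for remote analysis.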

Page 18: Tier 2 Prague Institute of Physics AS CR


Remote storage


remote/local   Method   TTreeCache   Events/s (%)   Bytes transferred (%)   CPU efficiency
local          rfio     ON           100%           117%                    98.9%
local          rfio     OFF          74%            100%                    72.7%
remote         dCap     ON           75%            101%                    73.5%
remote         dCap     OFF          46%            100%                    46.9%

• TTreeCache in ROOT helps a lot, both for local and for remote transfers

• Remote jobs with TTreeCache are faster than local jobs without the cache

Influence of distributing a Tier-2 data storage on physics analysis
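The gain from TTreeCache comes from ROOT prefetching baskets for many entries in a few large requests instead of issuing one small read per branch and entry. A minimal sketch of how a reading job could enable it (not taken from the slides; the dCap URL, tree name and cache size are illustrative placeholders):

   // Minimal sketch (not from the slides): enabling ROOT's TTreeCache when
   // reading a tree from the remote dCache Storage Element over dCap.
   // The door host, pnfs path, tree name and cache size are hypothetical.
   #include "TFile.h"
   #include "TTree.h"

   void read_with_ttreecache()
   {
      // Open the file through the dCap door of the remote SE (placeholder URL)
      TFile *f = TFile::Open("dcap://dcache-door.example.cz/pnfs/example.cz/atlas/sample.root");
      if (!f || f->IsZombie()) return;

      TTree *t = 0;
      f->GetObject("events", t);           // "events" is an assumed tree name
      if (!t) return;

      t->SetCacheSize(100 * 1024 * 1024);  // 100 MB read-ahead buffer (TTreeCache)
      t->AddBranchToCache("*", kTRUE);     // cache all branches used in the loop

      Long64_t n = t->GetEntries();
      for (Long64_t i = 0; i < n; ++i) {
         t->GetEntry(i);                   // entries arrive in large cached blocks,
                                           // hiding the per-request WAN latency
      }
      f->Close();
   }

Without the cache, every branch access becomes a separate small remote request, which is consistent with the lower CPU efficiency in the remote dCap row with TTreeCache OFF in the table above.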

Page 19: Tier 2 Prague Institute of Physics AS CR

Outlook

• In 2015, after the LHC start-up:
  – Higher data production
  – Flat financing not sufficient
  – Computing may become an item of M&O A (Maintenance & Operations, category A)
• A search for new financial resources or new unpaid capacities is necessary
  – CESNET
    • Free delivery of network infrastructure is crucial
    • Unpaid external storage – for how long?
  – IT4I, the Czech supercomputing project
    • Search for computing capacities (free cycles), relying on other projects to find a way to use them

Page 20: Tier 2 Prague Institute of Physics AS CR

• 16th International Workshop on Advanced Computing and Analysis Techniques in Physics (ACAT)
• http://www.particle.cz/acat2014
• Topics
  – Computing Technology for Physics Research
  – Data Analysis – Algorithms and Tools
  – Computations in Theoretical Physics: Techniques and Methods

Page 21: Tier 2 Prague Institute of Physics AS CR


Backup
