LondonGrid Status
Duncan Rand
Slide 2
GridPP 21 Swansea LondonGrid Status
LondonGrid
• Five universities with seven GOC sites
– Brunel University
– Imperial HEP and Imperial LeSC
– Queen Mary
– Royal Holloway
– UCL Central and UCL-HEP
• Thirteen compute clusters, eight storage elements
Slide 3
Administrative News
• Brunel – new sys-admin starts at the beginning of October
• RHUL – sys-admin post vacant since Jan 08; recruitment starting soon
• QMUL – sys-admin post vacant; about to be advertised
• Imperial – sys-admin went on maternity leave at the start of August; Barry MacEvoy replaces her part time
• LondonGrid – recruited new Technical Coordinator
– 0.25 FTE Imperial sys-admin
– starts mid-September
Slide 4
Brunel
• Site has been running OK over the last 6 months
– predominantly as CMS site but also ATLAS MC
– however, has recently suffered air conditioning problems
– DPM too small and somewhat unreliable
• Recent purchase of 60TB disk and 1 MSI2k CPU - in process of installing
• Still only 400 Mbit/s WAN link
Site news
Slide 5
Imperial
• HEP
– Workhorse ce00 running well
– Mainly CMS analysis and MC
– ATLAS MC jobs changed to stage out to RHUL SE
– IBM cluster (gw39) retired
– Recent purchase of 100 TB disk and ~2.4 MSI2k CPU
– Added 120 new WNs to ce00 this week
– Will the air conditioning cope?
• ICT/LeSC
– Essentially run ATLAS and CMS MC plus biomed
Slide 6
Royal Holloway
• Manpower low - awaiting new system administrator
• Nevertheless commissioned new cluster in April 2008
• Generally runs well
• Have had networking issues
– WAN issues seem to have been related to external firewall
• Set up as CMS Tier-3
Slide 7
Queen Mary
• No full-time admin
• Makeover completed
• Now running SGE
– CE gets overloaded; need to upgrade it
• DPM on Lustre – works OK
– 50 MB/s WAN access; LAN not yet stress tested
– will replace DPM head node with modern machine (10 Gbit/s)
• StoRM test SE – works OK but difficult to add ATLAS space tokens
• Running mainly ATLAS and biomed jobs
– need to get CMS MC running ASAP
Slide 8
University College
• UCL-Central finally passing SAM tests yesterday
• Need to complete acceptance tests, install ATLAS software and get MC running
• SE working – requires update of space tokens
• UCL-HEP purchasing new equipment – should be brought online soon
Slide 10
CPU contribution by site
• RHUL obvious arrival!
• QMUL/LeSC – increase
Slide 11
CPU usage by VO
• CMS and ATLAS are the big users in last 6 months
• Also biomed
• LHCb still low
• Concentrating on CMS & ATLAS…
[Chart: normalised CPU (kSI2k hours), axis 0 to 1,600,000, by VO (atlas, cms, biomed, compchem, lhcb, pheno, zeus, hone, ilc, fusion), broken down by site (UCL-HEP, RHUL, QMUL, IC-LeSC, IC-HEP, Brunel)]
Slide 13
CPU usage by Tier-2
Slide 14
CMS
• Imperial (T2-London-IC) and Brunel (T2-London-Brunel)
– Running OK
– SEs quite full
• London plan to use non-CMS sites as CMS Tier-3s
– Make use of T3-UK-London-RHUL cluster when ATLAS not using it
– ~75 TB CMS data already on site
– Up to 100 MB/s download from multiple CMS Tier-1s
– Running many analysis jobs (mostly Imperial users)
– Just joined MC production
– A real success in terms of supporting multiple VOs
Slide 15
CMS download to T3_UK_London_RHUL
ppMuX_pt10
2.4 TB transferred in an afternoon
Slide 16
CMS analysis jobs
From 2008-08-23 to 2008-08-26
Slide 17
http://dashb-cms-sv.cern.ch/dashboard/request.py/siteviewhome
CMS Site View Monitoring Page
• Excellent page summarising status of sites from CMS point of view
• The much sought-after single source of information
Slide 18
Other VOs
• vo.londongrid.ac.uk for local users
• Set up local LFC catalogue for
– supernemo.vo.eu-egee.org
– mice
– vo.londongrid.ac.uk
• UKQCD VO supported at RHUL
– large memory jobs and possibly MPI
Slide 19
Conclusion
• Still suffering from an acute shortage of manpower
– New Technical Coordinator starting very soon and three admin posts will be filled in autumn – hopefully!
• Nevertheless Royal Holloway and QMUL sites brought on line and now contributing as ATLAS sites
• RHUL also acting successfully as CMS Tier-3 – MC and analysis
• Imperial and Brunel continue to service CMS and also run ATLAS MC
• Looking forward to increased contribution by UCL
Slide 20
Thanks to all of the LondonGrid Team
Mona Aggarwal, David Colling, Austin Chamberlain, Clare Gryce, Simon George, Kostas Georgiou, Barry Green, Paul Kyberd, William Hay, Alex Martin, Giuseppe Mazza, Henry Nebrensky, Gianfranco Sciacca, Keith Sephton, Ben Waugh, Jeremy Yates