Pre-GDB on Batch Systems (Bologna), 11th March 2014
Torque/Maui: PIC and NIKHEF experience
C. Acosta-Silva, J. Flix, A. Pérez-Calero (PIC), J. Templon (NIKHEF)
Outline
‣ System overview
‣ Successful experience (NIKHEF and PIC)
‣ Torque/Maui current situation
‣ Torque overview
‣ Maui overview
‣ Outlook
System overview
‣ TORQUE is a community and commercial effort based on the OpenPBS project. It improves scalability and adds fault tolerance, among many other features
‣ http://www.adaptivecomputing.com/products/open-source/torque/
‣ Maui Cluster Scheduler is a job scheduler capable of supporting multiple scheduling policies. It is free and open-source software
‣ http://www.adaptivecomputing.com/products/open-source/maui/
System overview
‣ The TORQUE/Maui system has the usual batch system capabilities:
‣ Queue definition (routing queues)
‣ Accounting
‣ Reservations/QOS/Partitions
‣ FairShare
‣ Backfilling
‣ Handling of SMP and MPI jobs
‣ Multicore allocation and job backfilling ensure that Torque/Maui is capable of supporting multicore jobs
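The queue setup mentioned above is done with the qmgr tool. A minimal sketch follows; the queue names and limits are illustrative assumptions, not the actual PIC/NIKHEF configuration.

```
# Define a routing queue that dispatches jobs to per-VO execution queues
# (queue names and limits are illustrative only)
qmgr -c "create queue route queue_type=route"
qmgr -c "set queue route route_destinations = atlas"
qmgr -c "set queue route route_destinations += cms"
qmgr -c "set queue route enabled = true"
qmgr -c "set queue route started = true"

# A plain execution queue with a walltime limit
qmgr -c "create queue atlas queue_type=execution"
qmgr -c "set queue atlas resources_max.walltime = 72:00:00"
qmgr -c "set queue atlas enabled = true"
qmgr -c "set queue atlas started = true"
```

A multicore job then simply requests several slots on one node, e.g. `qsub -l nodes=1:ppn=8 job.sh`, and Maui's allocation and backfilling take care of placing it.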
Successful experience
‣ NIKHEF and PIC are multi-VO sites with local & Grid users
‣ Successful experience during the first LHC run with the Torque/Maui system
‣ Currently, both are running Torque-2.5.13 + Maui-3.3.4
‣ NIKHEF: 30% non-HEP, 55% WLCG, rest non-WLCG HEP or local jobs. Highly non-uniform workload
‣ 3800 job slots
‣ 97.5% utilization (last 12 months)
‣ 2000 waiting jobs (average)
Successful experience
NIKHEF: running jobs (last year)
NIKHEF: queued jobs (last year)
Successful experience
‣ PIC: 3% non-HEP, 83% Tier-1 WLCG, 12% ATLAS Tier-2, rest local jobs (ATLAS Tier-3, T2K, MAGIC, …)
‣ 3500 job slots
‣ ~95% utilization (last 12 months)
‣ 2500 waiting jobs (average)
Successful experience
PIC: running jobs (last year)
Successful experience
PIC: queued jobs (last year)
Torque overview
‣ Torque has a very active community:
‣ Mailing list: [email protected]
‣ Fully free support from Adaptive Computing
‣ New releases roughly every year (sometimes more often) and frequent new patches
‣ 2.5.13 is the last release of the 2.5.X branch
Torque overview
Version      | Last release date   | Patch release schedule* | EOL date      | EOSL date     | Recommended upgrade
TORQUE 4.2.x | 2014-02-27 (4.2.7)  | 12 weeks                | not announced | not announced | TORQUE 4.2.x
TORQUE 4.1.x | 2013-09-24 (4.1.7)  | EOL                     | 2014-02-10    | 2015-08-10    | TORQUE 4.2.x
TORQUE 4.0.x | 2012-05-03 (4.0.2)  | EOL                     | 2012-11-26    | 2014-04-26    | TORQUE 4.2.x
TORQUE 3.0.x | 2010-12-06 (3.0.6)  | EOL                     | 2012-11-26    | 2014-04-26    | TORQUE 4.2.x
TORQUE 2.5.x | 2013-08-01 (2.5.13) | EOL                     | 2013-07-20    | 2015-01-20    | TORQUE 4.2.x
TORQUE 2.4.x | 2012-05-29 (2.4.17) | EOL                     | 2012-06-01    | 2013-12-01    | TORQUE 4.2.x
TORQUE 2.3.x | 2012-01-01 (2.3.9)  | EOL                     | 2012-06-01    | 2013-12-01    | TORQUE 4.2.x
Torque overview
‣ Torque is well integrated with EMI middleware
‣ Vastly used in WLCG Grid sites (~75% of sites in the BDII publish -pbs-)
‣ Not complex to install, configure and manage:
‣ via the qmgr tool
‣ plain text accounting
‣ …
‣ Torque scalability issues:
‣ Reported for the 2.5.X branch
‣ Not detected at our scale
‣ The 4.2.X branch presents significant enhancements in scalability for large environments, responsiveness, reliability, …
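One practical benefit of the plain-text accounting mentioned above is that it is trivial to post-process. A minimal sketch, assuming the usual semicolon-separated Torque accounting record format (timestamp; record type Q/S/E/D; job id; space-separated key=value attributes); the sample line is invented for illustration.

```python
def parse_accounting_line(line):
    """Split one Torque accounting record into its four fields."""
    # Format assumed: "MM/DD/YYYY HH:MM:SS;<type>;<jobid>;key=value key=value ..."
    timestamp, rec_type, job_id, message = line.strip().split(";", 3)
    # key=value pairs; this sketch assumes values contain no spaces
    attrs = dict(pair.split("=", 1) for pair in message.split() if "=" in pair)
    return {"time": timestamp, "type": rec_type, "job": job_id, "attrs": attrs}

# Invented sample record of type E (job ended)
sample = ("03/11/2014 10:23:45;E;1234.pbs.example.org;"
          "user=alice queue=batch resources_used.walltime=02:00:00")
rec = parse_accounting_line(sample)
print(rec["type"], rec["attrs"]["queue"])  # → E batch
```

Summing walltime per user or per queue over the daily accounting files is then a few lines of the same kind of code, with no database involved.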
Maui overview
‣ Support: Maui is no longer supported by Adaptive Computing
‣ Documentation:
‣ Poor documentation makes the initial installation complex
‣ Things do not always work as the documentation suggests
‣ Scalability issues:
‣ At ~8000 queued jobs, Maui hangs
‣ The MAXIJOBS parameter can be adjusted to limit the number of jobs considered for scheduling
‣ This solves the issue (currently in production at NIKHEF)
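The workaround above amounts to a one-line change in maui.cfg. A sketch of the relevant fragment; the values are illustrative assumptions, not the NIKHEF production settings.

```
# Fragment of maui.cfg (values illustrative, not the NIKHEF production settings)
RMPOLLINTERVAL  00:01:00   # how often Maui polls Torque for job/node state
MAXIJOBS        4000       # cap on jobs considered per scheduling iteration,
                           # keeping Maui well below the ~8000-job hang point
```

Jobs beyond the cap simply stay queued in Torque and enter scheduling in later iterations, so the limit trades some scheduling latency for stability.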
Maui overview
‣ Moab is the non-free scheduler supported by Adaptive Computing, based on Maui
‣ It aims to increase scalability
‣ It comes with continued commercial support
‣ Configuration files are very similar to Maui's: http://docs.adaptivecomputing.com/mwm/help.htm#a.kmauimigrate.html
‣ Feedback from sites running Torque/Moab would be a good complement to this review
Outlook
‣ Torque/Maui scalability issues:
‣ Only relevant for larger sites; a feasible option for small-medium size sites
‣ Might well be solved by the 4.2.X branch and by tuning Maui options
‣ Actually, multicore jobs reduce the number of jobs to be handled by the system
‣ For sites that are predominantly WLCG (e.g. PIC at 95%), switching to a pure multicore load would further reduce scheduling issues at the site level
‣ For sites that are much less WLCG-dominated (e.g. NIKHEF at 55%), a switch to a pure multicore load might actually increase scheduling issues at the site level, as this move would remove much of the entropy which allows reaching 97% utilization
‣ Another concern is support for the systems, Maui being the weakest link in the Torque/Maui combination
Outlook
‣ Some future options:
‣ Change from Maui to Moab (but it is not free!)
‣ Set up a kind of “OpenMaui” project within WLCG sites as a community effort to provide support and improvements to Maui
‣ Integrate with another scheduler. Which one?
‣ Complete change to another system (SLURM, HTCondor, …)
‣ “Do nothing” until a real problem arrives
‣ Currently just a worry; no real problem detected so far at PIC/NIKHEF
‣ Improvements from migrating to another system are unclear
Outlook
‣ Questions:
‣ If WLCG sites decided to move away from Torque/Maui, would it be feasible before LHC Run 2?
‣ Migration to a new batch system requires time and effort, thus manpower and expertise, in order to reach an adequate performance for a Grid site
‣ Not clear if needed before Run 2
‣ What happens with sites shared with non-WLCG VOs?
‣ Impact on other users (NIKHEF 45%)
‣ For PIC, several disciplines rely on local job submission. A change of batch system affects many users, and requires re-education, changes, and tests of their submission tools to adapt to an eventual new system