Status of BES -...
Outline
• Current Status
• Migration from PBS to HTCondor
• Distributed Computing
• Distributed Monitoring
• High Performance Cluster
• Next Step
Storage Stuck Issue Solved
• The storage service was unstable after the summer maintenance.
• Storage log mining work:
– Collected and analyzed keywords from the storage file system logs, as sketched below.
– Narrowed the focus to the core switch provided by Ruijie.
• The file system stuck issue disappeared after the core Ethernet switch was replaced (Aug. 29th); the storage system is more stable now.
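The log-mining step amounts to keyword counting over the file system logs. A minimal sketch of the idea in Python; the keyword list and log path are illustrative assumptions, not the production setup:

```python
import re
from collections import Counter

# Illustrative keywords and path; the real list and log location used in
# the production log-mining work are not given in the slides.
KEYWORDS = ["timeout", "stale", "reset", "error"]
LOG_PATH = "/var/log/fs/server.log"

def count_keywords(path, keywords):
    """Count how often each keyword appears in the log file."""
    pattern = re.compile("|".join(keywords), re.IGNORECASE)
    counts = Counter()
    with open(path) as fh:
        for line in fh:
            for hit in pattern.findall(line):
                counts[hit.lower()] += 1
    return counts

if __name__ == "__main__":
    for word, n in count_keywords(LOG_PATH, KEYWORDS).most_common():
        print(word, n)
```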
[Figure: internal error number plots]
• Ruijie Network, the switch vendor, gave a preliminary conclusion: chipset-level monitoring results show problems in the traffic load-balancing algorithm.
New Storage Device
• A 3.6PB storage device will be purchased by the end of this year.
• 2.5PB of available space will be added.
Cloud Computing 1/2
• Supports multiple batch systems: PBS/Torque, HTCondor.
• Dynamic VM provisioning: VMs are created and destroyed on demand.
• Fair-share algorithm: guarantees that resources are distributed fairly among the different experiments.
[Diagram: when jobs queue, virtual machines are created automatically, subject to a per-experiment resource minimum limit; a sketch of this loop follows.]
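A minimal sketch of that provisioning loop in Python, using the openstacksdk client for brevity; the cloud profile, image and flavor names, the queue-check function, and the check interval are all illustrative assumptions (the production system is based on OpenStack Kilo and its own scheduling logic):

```python
import time

import openstack  # openstacksdk; illustrative, the production tooling may differ

# Hypothetical names: the cloud profile, worker image, and flavor are
# placeholders, not the production configuration.
CLOUD, IMAGE, FLAVOR = "ihep-cloud", "sl6-worker", "m1.medium"

def queued_jobs():
    """Placeholder: return the number of queued jobs for an experiment.

    In production this would query the batch system (PBS/Torque or HTCondor).
    """
    return 0

def provision_loop():
    conn = openstack.connect(cloud=CLOUD)
    image = conn.compute.find_image(IMAGE)
    flavor = conn.compute.find_flavor(FLAVOR)
    while True:
        if queued_jobs() > 0:
            # Boot one worker VM per check; the fair-share algorithm would
            # cap how many VMs each experiment may hold at a time.
            conn.compute.create_server(
                name="vm-worker-%d" % int(time.time()),
                image_id=image.id,
                flavor_id=flavor.id,
            )
        time.sleep(60)  # re-check the queue once a minute
```

Destroying idle VMs on demand would follow the same pattern with conn.compute.delete_server().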
Cloud Computing 2/2
• Based on OpenStack Kilo.
• Two kinds of cloud services:
• Infrastructure as a Service
– 14 compute nodes, 352 virtual cores (224 cores in use, 169 virtual machines running).
– User-oriented self service.
• Virtual Computing Cluster
– 28 compute nodes, 672 cores.
– Provides virtual machines on demand, matching real computing requirements.
– Transparent to users.
Migration from PBS to HTCondor 1/3
• New architecture:
• Central management.
• Integrated with monitoring.
• New monitoring tool.
• Easy to expand.
• HTCondor has supported JUNO and CMS for more than a year:
• High job-scheduling performance.
• Runs stably.
• 1000 BES CPU cores were migrated from PBS to the HTCondor cluster in Aug.
Migration from PBS to HTCondor 2/3
• HTCondor has been tested and improved according to users' feedback.
• HTCondor optimization:
• Job management stays almost the same for users (see the sketch after this list).
• The user manual is ready.
• New share policy:
• BES can take CPU resources from other experiments to cover peak demand.
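For users, submitting a job stays close to the PBS workflow (a submit description instead of a qsub script). A minimal sketch using the HTCondor Python bindings; the executable and file names are placeholders:

```python
import htcondor  # HTCondor Python bindings

# Roughly the HTCondor counterpart of "qsub run.sh" under PBS; the keys are
# standard submit-description attributes, the file names are placeholders.
sub = htcondor.Submit({
    "executable": "run.sh",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
    "request_cpus": "1",
})

schedd = htcondor.Schedd()         # local scheduler daemon
with schedd.transaction() as txn:  # transaction-style submit API
    cluster_id = sub.queue(txn)    # returns the new cluster id

print("submitted cluster", cluster_id)
```

The share policy itself lives on the pool side; it would typically be expressed with HTCondor accounting groups and surplus sharing, though the slides do not give the exact configuration.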
Migration from PBS to HTCondor 3/3
• Next:
• 2000 CPU cores will be migrated to HTCondor in Oct.
• All BES resources will be migrated to HTCondor by the end of this year.
• User training will be held in Oct.
BESIII Distributed Computing 1/3
• During the August summer maintenance, the DIRAC server was successfully upgraded from v6r13 to v6r15.
– Prepares for multi-core job support in the near future (see the submission sketch below).
– VMDirac was upgraded to 2.0, which greatly simplifies the procedure for adopting new cloud sites.
• A new monitoring system has been put into production, giving a clear view of real-time site status.
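Jobs reach the platform through the DIRAC Python API. A minimal submission sketch; the job name and script are placeholders, and since API method names vary across DIRAC releases, the multi-core request is only indicated as a comment:

```python
from DIRAC.Core.Base import Script
Script.parseCommandLine()  # standard DIRAC initialization

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

j = Job()
j.setName("bes-sim-example")          # placeholder job name
j.setExecutable("run_simulation.sh")  # placeholder script
# Multi-core support (in preparation after the v6r15 upgrade) would be
# requested here, e.g. via a processor-count or tag setting in newer releases.

result = Dirac().submitJob(j)  # some releases name this method submit()
print(result)
```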
BESIII Distributed Computing 2/3
• The 2nd BESIIICGEM Cloud Computing Summer School was successfully held in July at Shandong University.
• About 30 people joined the school.
– Teachers from INFN, IHEP, Zhejiang University, and 99cloud.
– Students from IHEP, SDU, JINR, Soochow, and USTC.
• The summer school greatly helped students gain knowledge of cloud computing and learn how to use the cloud for physics analysis.
– It helps push forward cloud applications in HEP.
BESIII Distributed Computing 3/3
• During the last three months, about 224K BESIII jobs were completed on the platform.
– 11 sites joined the production.
– ~40% of the jobs ran at the UMN site.
• Total data exchanged among sites is about 68.8TB.
• About 70 user tasks have been completed.
• The BESIII distributed computing system remained stable this season.
• Multi-core support is on the way to meet future challenges.
Distributed Monitoring System 1/2
• Motivation:
• Many remote sites are short of manpower for maintenance work.
• IHEP can help with routine maintenance for the remote sites.
• Distributed monitoring is the cornerstone.
• Migrated the monitoring server from Icinga to Nagios.
– Better support for a distributed monitoring architecture.
• Enhanced system security:
– Opened port 80 of the monitoring server to the outer network.
– Updated Apache from v2.2 to v2.4.
– System vulnerabilities are checked regularly.
Distributed Monitoring System 2/2
• Chengdu site:
• All hosts of the Chengdu site are monitored from the central site (IHEP).
• HTCondor:
• The HTCondor service is monitored on all computing servers (a sketch of such a check follows).
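Checks like this follow the Nagios plugin convention: print one status line and exit 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN. A minimal sketch of a check for the HTCondor master daemon; the process name is the standard one, but the production checks themselves are not shown in the slides:

```python
#!/usr/bin/env python3
"""Nagios-style check: is the condor_master daemon running on this host?"""
import subprocess
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes

def main():
    try:
        out = subprocess.run(["pgrep", "-c", "condor_master"],
                             capture_output=True, text=True)
    except OSError:
        print("UNKNOWN - pgrep not available")
        sys.exit(UNKNOWN)
    if out.returncode == 0 and int(out.stdout.strip()) > 0:
        print("OK - condor_master is running")
        sys.exit(OK)
    print("CRITICAL - condor_master is not running")
    sys.exit(CRITICAL)

if __name__ == "__main__":
    main()
```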
High Performance Cluster 1/2
• A new heterogeneous hardware platform: CPU, Intel Xeon Phi, GPU.
• Parallel programming support: MPI, OpenMP, CUDA, OpenCL, etc. (see the MPI sketch below).
• Potential use cases: simulation, partial wave analysis, etc.
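As an illustration of the MPI support, the canonical hello-world with the mpi4py bindings (assuming mpi4py is installed alongside the cluster's MPI stack):

```python
from mpi4py import MPI  # assumes the mpi4py bindings are installed

comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # this process's id within the MPI job
size = comm.Get_size()  # total number of MPI processes

print("rank %d of %d on %s" % (rank, size, MPI.Get_processor_name()))
```

Launched with, e.g., mpirun -n 4 python hello.py, each rank prints its id and host.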
[Diagram: HPC cluster architecture. Compute: 1000 CPU cores, 150 GPU cards, 50 Xeon Phi cards; storage: 700TB (EOS/NFS); a login farm receives jobs from local users and remote sites. Network: Mellanox SX6012/SX6036 InfiniBand FDR 56Gb/s fabric and a Brocade 8770 Ethernet 10/40Gb/s switch, with 112Gb/s and 80Gb/s links.]
High Performance Cluster 2/2
• SLURM as the scheduler (a submission sketch follows this list).
• Test bed is ready: version 16.05.
– Virtual machines: 1 control node, 4 computing nodes.
– Physical servers: 1 control node, 26 computing nodes (2 GPU servers included).
• Scheduler evaluation is under way.
– Two scheduling algorithms evaluated: sched/backfill and sched/builtin.
– Integration with DIRAC is under way.
• Network architecture & technologies:
– The InfiniBand network for the HPC test bed is already built.
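Jobs would reach SLURM through sbatch; the backfill-versus-builtin comparison corresponds to the slurm.conf SchedulerType setting (sched/backfill vs sched/builtin). A minimal submission sketch in Python using standard sbatch options; the partition name and batch script are placeholders:

```python
import subprocess

# Standard sbatch flags; the partition name and script are placeholders.
cmd = [
    "sbatch",
    "--job-name=hpc-test",
    "--partition=gpu",  # hypothetical partition
    "--nodes=1",
    "--ntasks=4",
    "run_mpi.sh",       # placeholder batch script
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout.strip())  # e.g. "Submitted batch job 12345"
```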
Next Step
• ~2.5PB of available storage will be added for BES.
• The migration from PBS to HTCondor will be finished by the end of this year.
• IHEP will provide routine maintenance service to more remote sites.
• The HPC cluster will be put into service next year.
Conclusion
• The computing platform for BESIII is more stable after the core switch replacement.
• The optimized HTCondor cluster can satisfy BES computing requirements.
• The distributed monitoring system provides maintenance service to remote BES sites.
• The HPC test bed has been built and is now under evaluation.