bright cluster manager: a comprehensive, integrated management solution for parallel universes today...
DESCRIPTION
Fast on the uptake, Bright Cluster Manager received the Intel Cluster Ready Voyager Award in 2009. The synergy between Intel’s program and Bright’s software continues to this day. With the introduction of general-availability support for the Intel Xeon Phi in Bright Cluster Manager 6.1, Intel and Bright are already taking the initiative to broaden and deepen this program. At the highest level, Bright 6.1 allows stand-alone Intel Xeon Phi coprocessors to be provisioned, monitored and managed, thus extending what Bright has been able to do for Intel CPUs for some time. Bright support for this new coprocessor from Intel includes recompilation of the appropriate driver against the kernel at boot time, comprehensive networking, and various tailored metrics. Even more interesting, however, is Bright’s ability to incorporate individual Intel Xeon Phi coprocessors as bona fide devices from the more holistic perspective of the cluster. As first-class devices, the Intel Xeon Phi coprocessors are monitored and managed on an ongoing basis via the command-line and GUI interfaces in Bright, while also being accessible via the Bright APIs. Because this latest version of Bright also includes enhancements in the area of workload management, organizations can already take advantage of the various execution models presented by the hybrid Intel Xeon/Intel Xeon Phi environment, from native to offload modes. Execution-model support targeted specifically at this hybrid CPU-coprocessor architecture complements existing support, via Bright's Cluster Health Framework, for a problem-free environment in which to run compute jobs. Overall, Bright Cluster Manager 6.1 delivers a turnkey capability to organizations that seek to rapidly integrate, and get results from, the introduction of the Intel Xeon Phi into their HPC environments. This presentation provides a brief technical overview of support for the Intel Xeon Phi via Bright Cluster Manager 6.1.
TRANSCRIPT
Ian Lumb Bright Evangelist
Bright Cluster Manager: A Comprehensive, Integrated Management Solution for Parallel Universes Today and Tomorrow
In My Parallel Universe …
In my parallel universe, parallel computing at extreme scale is easy!
• Scientists focus on science, engineers on engineering
• No problem is out of computational reach
Coding has been deprecated!
– Problems are stated in the natural language of the discipline
  » Implementation suggestions/guidelines are optional
– 'Heuristic algorithms' take care of the implementation specifics (i.e., the coding)
Resources are plentiful!
– Physical constraints (e.g., power, cooling & space) have been eliminated
– Generic processors to specialized coprocessors are readily available
– Resource management is completely transparent
Parallel Computing via Bright Cluster Manager
Provisions, monitors and manages all neo-heterogeneous resources
• Systems, storage, interconnects, etc.
Management, parallelized
• Adaptive provisioning in real time
• Topologically based monitoring
• Fault tolerance via high availability
• One GUI for multiple clusters and clouds
Development simplified
• Tools and libraries available
• Workloads managed
Bright Cluster Architecture
[Diagram: head node running CMDaemon, with compute nodes node001, node002 and node003. Clients shown connecting via SOAP+SSL: Cluster Management GUI, Cluster Management Shell, Web-Based User Portal, Third-Party Applications.]
[Diagram: software layers: Cluster Management GUI, Cluster Management Shell, User Portal; SSL / SOAP / X509 / IPtables; Cluster Management Daemon; SLES / RHEL / CentOS / SL.]
Bright Cluster Manager: Elements
[Diagram: elements of Bright Cluster Manager]
• Operating system: SLES / RHEL / CentOS / SL
• ScaleMP vSMP
• Provisioning, Monitoring, Automation, Health Checks, Management
• Workload managers: SLURM, Torque/Maui, Torque/MOAB, PBS Pro, Grid Engine, LSF
• Development: Compilers, Libraries, Debuggers, Profilers
• Hardware: CPU, MIC, Memory, PDU, IPMI/iLO, Interconnect, Ethernet, Disk
Management Interface
Graphical User Interface (GUI)
• Offers administrator full cluster control
• Standalone desktop application
• Manages multiple clusters simultaneously
• Runs natively on Linux, Windows and MacOS
Cluster Management Shell (CMSH)
• All GUI functionality also available through Cluster Management Shell
• Interactive and scriptable in batch mode
Intel Xeon Phi Integration
Everything needed to enable Xeon Phi on a cluster is packaged as easy-to-install Bright packages:
• Xeon Phi driver
• Xeon Phi runtime
• Xeon Phi SDK
• Xeon Phi OFED
• Xeon Phi flash utilities
Environment modules ensure that the user environment is set up correctly (PATH, LD_LIBRARY_PATH, ...)
Xeon Phi driver is recompiled automatically against the running kernel at boot time
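To make the environment-module point concrete, here is a minimal sketch of what loading such a module amounts to: prepending tool and library directories to PATH and LD_LIBRARY_PATH. The install prefix `/opt/example/mic` and the helper names `prepend_path` and `load_mic_module` are hypothetical and chosen only for illustration; they are not Bright's actual paths or module definitions.

```python
import os


def prepend_path(env, var, directory):
    """Return a copy of env with directory at the front of the path-like variable var."""
    updated = dict(env)
    current = updated.get(var, "")
    # Drop empty entries and any earlier copy of the directory to avoid duplicates.
    parts = [p for p in current.split(os.pathsep) if p and p != directory]
    updated[var] = os.pathsep.join([directory] + parts)
    return updated


# Hypothetical install prefix; the real location depends on the site and packages.
MIC_PREFIX = "/opt/example/mic"


def load_mic_module(env):
    """Mimic the effect of loading an environment module for the coprocessor tools."""
    env = prepend_path(env, "PATH", MIC_PREFIX + "/bin")
    env = prepend_path(env, "LD_LIBRARY_PATH", MIC_PREFIX + "/lib")
    return env
```

Loading the same module twice leaves the environment unchanged, which is the behaviour users generally expect from a module system.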
Intel Xeon Phi Integration
Set-up wizard takes care of initial Xeon Phi configuration (e.g. creating bridge interfaces, assigning IP addresses)
Xeon Phi appears as a first-class device type in cluster management infrastructure
Xeon Phi can be configured, controlled and monitored through CMSH and CMGUI
Xeon Phi is automatically added to the workload management system as a consumable resource
Compute jobs may request Xeon Phi resource in job script
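The idea of a consumable resource can be sketched as a per-node counter that a scheduler decrements when a job is placed and restores when it finishes. The sketch below is a simplified model, not the logic of any particular workload manager; the `Node` class and the `place_job` and `release_job` functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    mic_total: int       # coprocessors installed in the node
    mic_in_use: int = 0  # coprocessors currently allocated to jobs

    def mic_free(self):
        return self.mic_total - self.mic_in_use


def place_job(nodes, mic_requested):
    """Consume coprocessors on the first node with enough free; return its name."""
    for node in nodes:
        if node.mic_free() >= mic_requested:
            node.mic_in_use += mic_requested
            return node.name
    return None  # nothing fits, so the job stays queued


def release_job(nodes, node_name, mic_requested):
    """Give the coprocessors back when the job ends."""
    for node in nodes:
        if node.name == node_name:
            node.mic_in_use = max(0, node.mic_in_use - mic_requested)
```

A job that asks for more coprocessors than any node has free is simply not placed, which is how a consumable resource prevents oversubscription.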
Bright Cluster Architecture: Monitoring
[Diagram: node001, node002 and node003, each with a BMC, report metrics to CMDaemon on the head node, which holds raw data and consolidated data. Metrics and data are shown flowing to the Cluster Management GUI, Cluster Management Shell, Web-Based User Portal and Third-Party Applications.]
Cluster Health Management
Goal: provide a problem-free environment for running jobs
Regular health checks
• Actions that return PASS, FAIL or UNKNOWN
• Can be associated with a settable severity and a message
• Can launch an action based on any response value
Pre-job health checks
16 Xeon Phi health checks included by default
Jobs will only be scheduled to nodes where Xeon Phi is working properly (as determined by health checks)
Intel Cluster Checker included to verify that cluster is set up properly
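The health-check model described above (a check returns PASS, FAIL or UNKNOWN, carries a severity and message, may trigger an action on any response, and gates job scheduling) can be sketched roughly as follows. This is an illustrative model only and not Bright's implementation; `HealthCheck`, `run_checks` and `schedulable` are hypothetical names, and treating a crashed probe as UNKNOWN is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


@dataclass
class HealthCheck:
    name: str
    probe: Callable[[str], Result]  # takes a node name, returns a Result
    severity: int = 10              # settable severity
    message: str = ""               # settable message
    # Optional action to launch for each possible response value.
    actions: Dict[Result, Callable[[str], None]] = field(default_factory=dict)


def run_checks(node: str, checks: List[HealthCheck]) -> Dict[str, Result]:
    """Run every check on a node and launch the action tied to each response."""
    results = {}
    for check in checks:
        try:
            result = check.probe(node)
        except Exception:
            result = Result.UNKNOWN  # assumption: a crashed probe counts as UNKNOWN
        action = check.actions.get(result)
        if action is not None:
            action(node)
        results[check.name] = result
    return results


def schedulable(nodes: List[str], checks: List[HealthCheck]) -> List[str]:
    """Pre-job gate: keep only nodes on which every check returns PASS."""
    return [
        node for node in nodes
        if all(r is Result.PASS for r in run_checks(node, checks).values())
    ]
```

In this sketch a node with any FAIL or UNKNOWN result is withheld from scheduling, which is one conservative reading of "working properly".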
Intel Xeon Phi Workload Management
Three ways to run Xeon Phi jobs:
• Offload (i.e. Xeon Phi is used as coprocessor from host)
• Native (i.e. job executes entirely on Xeon Phi)
• Symmetric (i.e. communicating processes on both host and Xeon Phi)
Offload: Xeon Phi represented as consumable resource in workload management system
Native: Ported Slurm to Xeon Phi
Symmetric: work in progress, will require some changes to workload managers
Additional work in progress: make sure Xeon Phi is not used in multiple modes simultaneously
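One way to picture the "not used in multiple modes simultaneously" requirement is a per-coprocessor mode lock: the first job fixes the mode, later jobs must match it, and the lock clears when the last job leaves. The sketch below is a hypothetical illustration of that idea, not the mechanism Bright uses; `Coprocessor`, `ModeConflict` and the mode strings are assumed names.

```python
MODES = ("offload", "native", "symmetric")


class ModeConflict(Exception):
    """Raised when a job asks for a mode other than the one currently in use."""


class Coprocessor:
    def __init__(self, name):
        self.name = name
        self.mode = None   # None means the coprocessor is idle
        self.jobs = set()  # job ids currently using the coprocessor

    def acquire(self, job_id, mode):
        if mode not in MODES:
            raise ValueError("unknown mode: " + mode)
        if self.mode is not None and self.mode != mode:
            raise ModeConflict(
                self.name + " is in " + self.mode + " mode, cannot run " + mode
            )
        self.mode = mode
        self.jobs.add(job_id)

    def release(self, job_id):
        self.jobs.discard(job_id)
        if not self.jobs:
            self.mode = None  # the last job left, so any mode is allowed again
```

Whether several jobs in the same mode may share one coprocessor is a separate policy question; the sketch allows it and leaves capacity limits to the consumable-resource accounting.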
Cherry Creek
Bright Cluster Manager makes it easy to install, manage and use clusters with Intel Xeon Phi coprocessors.
Questions?