Bright Cluster Manager: A Comprehensive, Integrated Management Solution for Parallel Universes Today and Tomorrow

Ian Lumb, Bright Evangelist

Upload: ian-lumb

Post on 17-Jun-2015

DESCRIPTION

Bright Cluster Manager received the Intel Cluster Ready Voyager Award in 2009, and the synergy between Intel’s program and Bright’s software continues to this day. With general-availability support for the Intel Xeon Phi in Bright Cluster Manager 6.1, Intel and Bright are broadening and deepening this program.

At the highest level, Bright 6.1 allows stand-alone Intel Xeon Phi coprocessors to be provisioned, monitored and managed, extending what Bright has long been able to do for Intel CPUs. Support for the coprocessor includes recompilation of the appropriate driver against the kernel at boot time, comprehensive networking, and various tailored metrics. Even more interesting is Bright’s ability to incorporate individual Intel Xeon Phi coprocessors as bona fide devices from the more holistic perspective of the cluster. As first-class devices, the coprocessors are monitored and managed on an ongoing basis via the command-line and GUI interfaces in Bright, and are also accessible via the Bright APIs.

Because this version of Bright also includes enhancements in workload management, organizations can already take advantage of the execution models presented by the hybrid Intel Xeon and Intel Xeon Phi environment, from native to offload modes. This execution-model support complements Bright’s Cluster Health Framework, which aims to provide a problem-free environment for running compute jobs. Overall, Bright Cluster Manager 6.1 delivers a turnkey capability to organizations that seek to rapidly integrate the Intel Xeon Phi into their HPC environments and get results from it. This presentation gives a brief technical overview of Intel Xeon Phi support in Bright Cluster Manager 6.1.

TRANSCRIPT

Page 1: Bright Cluster Manager: A Comprehensive, Integrated Management Solution for Parallel Universes Today and Tomorrow

Ian Lumb, Bright Evangelist

Page 2
Page 3

In My Parallel Universe …

In my parallel universe, parallel computing at extreme scale is easy!
• Scientists focus on science, engineers on engineering
• No problem is out of computational reach

Coding has been deprecated!
– Problems are stated in the natural language of the discipline
  » Implementation suggestions/guidelines are optional
– ‘Heuristic algorithms’ take care of the implementation specifics (i.e., the coding)

Resources are plentiful!
– Physical constraints (e.g., power, cooling & space) have been eliminated
– Generic processors to specialized coprocessors are readily available
– Resource management is completely transparent

Page 4

Parallel Computing via Bright Cluster Manager

Provisions, monitors and manages all neo-heterogeneous resources
• Systems, storage, interconnects, etc.

Management, parallelized
• Adaptive provisioning in real time
• Topologically based monitoring
• Fault tolerance via high availability
• One GUI for multiple clusters and clouds

Development simplified
• Tools and libraries available
• Workloads managed

Page 5

Bright Cluster Architecture

[Figure: architecture diagram. Labels: CMDaemon; SOAP+SSL; Cluster Management GUI; Cluster Management Shell; Web-Based User Portal; Third-Party Applications; head node; node001, node002, node003.]

Page 6

Bright Cluster Manager: Elements

[Figure: layered stack diagram. Labels, grouped as they appear:]
• Cluster Management GUI, Cluster Management Shell, User Portal
• SSL / SOAP / X509 / IPtables
• Cluster Management Daemon
• Provisioning, Monitoring, Automation, Health Checks, Management
• SLURM, Torque/Maui, Torque/MOAB, PBS Pro, Grid Engine, LSF
• Compilers, Libraries, Debuggers, Profilers
• SLES / RHEL / CentOS / SL
• ScaleMP vSMP
• CPU, MIC, Memory, PDU, IPMI/iLO, Interconnect, Ethernet, Disk

Page 7

Management Interface

Graphical User Interface (GUI)
• Offers administrator full cluster control
• Standalone desktop application
• Manages multiple clusters simultaneously
• Runs natively on Linux, Windows and MacOS

Cluster Management Shell (CMSH)
• All GUI functionality also available through Cluster Management Shell
• Interactive and scriptable in batch mode

Page 8
Page 9

Intel Xeon Phi Integration

Everything needed to enable Xeon Phi on a cluster is packaged as easy-to-install Bright packages:
• Xeon Phi driver
• Xeon Phi runtime
• Xeon Phi SDK
• Xeon Phi OFED
• Xeon Phi flash utilities

Environment modules ensure that the user environment is set up perfectly (PATH, LD_LIBRARY_PATH, ...)

Xeon Phi driver recompiled automatically against running kernel at boot time
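
What an environment module does to a user's session can be sketched in a few lines of Python. This is an illustration only: the install prefix `/opt/example/mic` and the helpers `prepend_path` and `load_module` are hypothetical names chosen for the example, not paths or functions shipped by Bright or Intel.

```python
import os


def prepend_path(env, var, directory):
    """Return a copy of env with directory placed first in the path list var."""
    updated = dict(env)
    current = [p for p in updated.get(var, "").split(os.pathsep) if p]
    # Drop an existing entry so that loading twice does not duplicate it.
    current = [p for p in current if p != directory]
    updated[var] = os.pathsep.join([directory] + current)
    return updated


def load_module(env, prefix):
    """Mimic loading a module: expose bin/ and lib/ under prefix."""
    env = prepend_path(env, "PATH", os.path.join(prefix, "bin"))
    env = prepend_path(env, "LD_LIBRARY_PATH", os.path.join(prefix, "lib"))
    return env


# Hypothetical install prefix, used only for this example.
session = load_module({"PATH": "/usr/bin"}, "/opt/example/mic")
```

The sketch returns a new dictionary rather than mutating `os.environ`, so the change is easy to inspect and to undo.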

Page 10

Intel Xeon Phi Integration

Set-up wizard takes care of initial Xeon Phi configuration (e.g. creating bridge interfaces, assigning IP addresses)

Xeon Phi appears as a first-class device type in cluster management infrastructure

Xeon Phi can be configured, controlled and monitored through CMSH and CMGUI

Xeon Phi is automatically added to the workload management system as a consumable resource

Compute jobs may request Xeon Phi resource in job script
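
The idea of a consumable resource can be sketched as simple bookkeeping: each node advertises a number of coprocessors, a job requests some of them, and the free count drops until the job finishes. The sketch below is a generic model under that assumption, not the logic of any particular workload manager; `Node`, `allocate` and `release` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    phi_total: int        # coprocessors installed in the node
    phi_in_use: int = 0   # coprocessors held by running jobs

    @property
    def phi_free(self):
        return self.phi_total - self.phi_in_use


def allocate(nodes, phi_requested):
    """Place a job on the first node with enough free coprocessors.

    Returns the chosen node's name, or None if the job has to wait.
    """
    for node in nodes:
        if node.phi_free >= phi_requested:
            node.phi_in_use += phi_requested
            return node.name
    return None


def release(nodes, name, phi_count):
    """Give coprocessors back when a job finishes."""
    for node in nodes:
        if node.name == name:
            node.phi_in_use = max(0, node.phi_in_use - phi_count)


cluster = [Node("node001", phi_total=1), Node("node002", phi_total=2)]
first = allocate(cluster, 2)    # only node002 has two coprocessors
second = allocate(cluster, 1)   # node001 still has one free
third = allocate(cluster, 1)    # nothing left, so this job waits
```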

Page 11
Page 12
Page 13
Page 14
Page 15
Page 16

Bright Cluster Architecture: Monitoring

[Figure: monitoring architecture diagram. Labels: CMDaemon; metrics; data; raw data; consolidated data; BMC; Cluster Management GUI; Cluster Management Shell; Web-Based User Portal; Third-Party Applications; head node; node001, node002, node003.]
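
The figure distinguishes raw data from consolidated data. One common way to consolidate monitoring samples is to average them over fixed-width time buckets; the sketch below shows that approach as an assumption, since the slides do not say how CMDaemon consolidates. The function `consolidate` and the sample values are made up for the example.

```python
from collections import defaultdict


def consolidate(samples, interval):
    """Average raw (timestamp, value) samples over fixed-width time buckets.

    Returns a sorted list of (bucket_start, mean_value) pairs.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = (timestamp // interval) * interval
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Made-up temperature samples: (seconds, degrees Celsius).
raw = [(0, 60.0), (30, 62.0), (60, 70.0), (90, 66.0), (120, 65.0)]
per_minute = consolidate(raw, 60)
```

Here `per_minute` holds one averaged value per 60-second bucket, which trades detail for a smaller data volume.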

Page 17
Page 18

Cluster Health Management

Goal: provide problem-free environment for running jobs

Regular health checks
• Actions that return PASS, FAIL or UNKNOWN
• Can be associated with a settable severity and a message
• Can launch an action based on any response value

Pre-job health checks

16 Xeon Phi health checks included by default

Jobs will only be scheduled to nodes where Xeon Phi is working properly (as determined by health checks)

Intel Cluster Checker included to verify that cluster is set up properly
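
The health-check model on this slide can be sketched as follows: a check returns PASS, FAIL or UNKNOWN, carries a severity and a message, may trigger an action for any result, and a pre-job filter keeps only nodes that pass. All names here (`HealthCheck`, `phi_online`, `run_checks`, `eligible_nodes`, the `phi_state` field) are hypothetical, and treating UNKNOWN as not eligible is an assumption of the sketch rather than documented Bright behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


@dataclass
class HealthCheck:
    name: str
    probe: Callable[[dict], Result]  # inspects a node and returns a Result
    severity: int
    message: str


def phi_online(node):
    """Hypothetical probe that reads a made-up 'phi_state' field."""
    state = node.get("phi_state")
    if state is None:
        return Result.UNKNOWN
    return Result.PASS if state == "online" else Result.FAIL


def run_checks(node, checks, actions):
    """Run every check and trigger the action registered for its result, if any."""
    results = {}
    for check in checks:
        result = check.probe(node)
        results[check.name] = result
        action = actions.get((check.name, result))
        if action is not None:
            action(node, check)
    return results


def eligible_nodes(nodes, checks, actions):
    """Pre-job filter: keep only nodes where every check returns PASS."""
    return [
        node["name"]
        for node in nodes
        if all(r is Result.PASS for r in run_checks(node, checks, actions).values())
    ]


drained = []
checks = [HealthCheck("phi_online", phi_online, severity=10, message="Xeon Phi not online")]
actions = {("phi_online", Result.FAIL): lambda node, check: drained.append(node["name"])}
nodes = [
    {"name": "node001", "phi_state": "online"},
    {"name": "node002", "phi_state": "lost"},
    {"name": "node003"},
]
ready = eligible_nodes(nodes, checks, actions)
```

In this run only `node001` is eligible, and the FAIL action records `node002` in `drained`.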

Page 19

Intel Xeon Phi Workload Management

Three ways to run Xeon Phi jobs:
• Offload (i.e. Xeon Phi is used as coprocessor from host)
• Native (i.e. job executes entirely on Xeon Phi)
• Symmetric (i.e. communicating processes on both host and Xeon Phi)

Offload: Xeon Phi represented as consumable resource in workload management system

Native: Ported Slurm to Xeon Phi

Symmetric: work in progress, will require some changes to workload managers

Additional work in progress: make sure Xeon Phi is not used in multiple modes simultaneously
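
The last point, keeping a coprocessor from being used in several modes at once, amounts to a mode lock on the device. The slide describes this as work in progress, so the sketch below is only one possible guard, not Bright's implementation; the `Coprocessor` class is hypothetical, and allowing several jobs to share a device in the same mode is an assumption.

```python
class Coprocessor:
    """Tracks which execution mode currently holds a coprocessor."""

    MODES = ("offload", "native", "symmetric")

    def __init__(self, name):
        self.name = name
        self.mode = None   # None means the device is idle
        self.jobs = set()

    def acquire(self, job_id, mode):
        """Admit a job only if the device is idle or already in the same mode."""
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if self.mode is not None and self.mode != mode:
            return False
        self.mode = mode
        self.jobs.add(job_id)
        return True

    def release(self, job_id):
        """Remove a job; the device becomes idle once its last job leaves."""
        self.jobs.discard(job_id)
        if not self.jobs:
            self.mode = None


mic0 = Coprocessor("mic0")
offload_ok = mic0.acquire("job-1", "offload")    # device was idle
native_blocked = mic0.acquire("job-2", "native")  # refused while in offload mode
mic0.release("job-1")
native_ok = mic0.acquire("job-2", "native")      # accepted once the device is idle
```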

Page 20

Cherry Creek

Page 21

Bright Cluster Manager makes it easy to install, manage and use clusters with Intel Xeon Phi coprocessors.

Page 22

Questions?