
Open Source Cluster Applications Resources

Overview
- What is O.S.C.A.R.?
- History
- Installation
- Operation
- Spin-offs
- Conclusions

History
- CCDK (Community Cluster Development Kit)
- OCG (Open Cluster Group)
- OSCAR (the Open Source Cluster Application Resource)
- IBM, Dell, SGI and Intel working closely together
- ORNL (Oak Ridge National Laboratory)

First Meeting
- Tim Mattson and Stephen Scott
- The group agreed on the following:

That the adoption of clusters for mainstream, high-performance computing is inhibited by a lack of well-accepted software stacks that are robust and easy to use by the general user.

That the group embraces the open-source model of software distribution. Anything contributed to the group must be freely distributable, preferably as source code under the Berkeley open-source license.

That the group can accomplish its goals by propagating best-known practices built up through many years of hard work by cluster computing pioneers.

Initial Thoughts
- Differing architectures (small, medium, large)
- Two paths of progress: R&D and ease of use
- Primarily for non-computer-savvy users: scientists, academics
- Homogeneous systems

Timeline
- Initial meeting in 2000
- Beta development started the same year
- First distribution, OSCAR 1.0, in 2001 at LinuxWorld Expo in New York City
- Today up to OSCAR 5.1: heterogeneous systems, far more robust, more user friendly

Supported Distributions – 5.0

Distribution and Release     Architecture   Status
Red Hat Enterprise Linux 4   x86            Fully supported
Red Hat Enterprise Linux 4   x86_64         Fully supported
Red Hat Enterprise Linux 4   ia64           Fully supported
Fedora Core 4                x86            Fully supported
Fedora Core 4                x86_64         Fully supported
Fedora Core 5                x86            Fully supported
Fedora Core 5                x86_64         Fully supported
Mandriva Linux 2006          x86            Fully supported
SUSE Linux 10.0              x86            Fully supported

Installation
- Detailed installation notes
- Detailed user guide
- Basic idea:
  - Configure head node (server)
  - Configure image for client nodes
  - Configure network
  - Distribute node images
  - Manage your own cluster!

Head Node
- Install by running the ./install_cluster eth1 script
- The GUI auto-launches
- Choose the desired step in the GUI; make sure each step is complete before proceeding to the next one
- All configuration can be done from this system from now on
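A minimal sketch of kicking the install off, assuming the OSCAR distribution was unpacked under /opt/oscar (a hypothetical location) and that eth1 is the interface facing the cluster network:

    cd /opt/oscar            # assumed unpack location
    ./install_cluster eth1   # the wizard GUI auto-launches for the remaining steps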

Download
- Subversion is used; the default is the OSCAR SVN, but a custom SVN can be set up
- Allows for up-to-date installations
- Allows for controlled rollouts of multiple clusters
- OPD (the OSCAR Package Downloader) also has powerful command-line functionality (LWP for proxy servers)
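A hedged sketch of driving OPD from behind a proxy; the exact path to opd varies by release, LWP honors the standard proxy environment variables, and the proxy host here is hypothetical:

    export http_proxy=http://proxy.example.com:3128   # hypothetical proxy
    ./opd                                             # interactive OSCAR Package Downloader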

Select & Configure OSCAR Packages
- Customize the server to your liking/needs
- Some packages can be customized
- This step is crucial: the choice of packages can affect performance as well as compatibility

Installation of Server Node
- Simply installs the packages which were selected
- Automatically configures the server node
- Now the Head (Server) is ready to manage, administer and schedule jobs for its client nodes

Build Client Image
- Choose a name
- Specify packages within the package file
- Specify the distribution
- Be wary of automatic reboot if network boot is manually selected as the default

Building the Client Image …

Define Clients
- This step creates the network structure of the nodes (example values below)
- It is advisable to assign IPs based on physical links
- The GUI has shortcomings regarding multiple IP spans
- Incorrect setup can lead to an error during node installation
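As a purely illustrative example, the kind of values this dialog asks for might look like the following (all names and addresses are hypothetical):

    Image name:      oscarimage
    Base name:       oscarnode
    Number of hosts: 4
    Starting IP:     192.168.0.2
    Subnet mask:     255.255.255.0

which would define clients oscarnode1 through oscarnode4 on consecutive addresses.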


Setup Networking
- SIS (System Installation Suite) / SystemImager
- MAC addresses are scanned for; each MAC must be linked to a node
- A network boot method must be selected (rsync, multicast, bt)
- Make sure clients support PXE boot, or create boot CDs
- Your own kernel can be used if the one supplied with SIS does not work

Client Installation and Test
- After the network is properly configured, installation can begin
- All nodes are installed and rebooted
- Once the system imaging is complete, a test can be run to ensure the cluster is working properly
- At this point, the cluster is ready to begin parallel job scheduling

Operation
Admin packages:
- Torque Resource Manager
- Maui Scheduler
- C3
- pfilter
- System Imager Suite
- Switcher Environment Manager
- OPIUM
- Ganglia

Operation
Library packages:
- LAM/MPI
- Open MPI
- MPICH
- PVM

Torque Resource Manager
- Server on the Head node; a "mom" daemon on the clients
- Handles job submission and execution
- Keeps track of cluster resources
- Has its own scheduler but uses Maui by default
- Commands are not intuitive; the documentation must be read
- Derived from OpenPBS
- http://svn.oscar.openclustergroup.org/wiki/oscar:5.1:administration_guide:ch4.1.1_torque_overview
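A minimal sketch of a Torque job script and the everyday commands around it; the program name ./a.out and the job id are placeholders:

    #!/bin/sh
    #PBS -N hello
    #PBS -l nodes=2:ppn=2
    #PBS -j oe
    cd $PBS_O_WORKDIR
    mpirun -np 4 ./a.out

    qsub hello.pbs    # submit the script above; Maui decides when it runs
    qstat             # queue state as Torque sees it
    pbsnodes -a       # status of every node's pbs_mom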

Maui Scheduler
- Handles job scheduling
- Sophisticated algorithms; customizable
- Much literature on its algorithms
- Has a commercial successor called Moab
- Accepted as the unofficial HPC standard for scheduling
- http://www.clusterresources.com/pages/resources/documentation.php
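A few Maui commands worth knowing, sketched with a placeholder job id:

    showq         # the queue as Maui sees it: running, idle, blocked
    checkjob 42   # why a given job is, or is not, running
    showbf        # the backfill window: what could start immediately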

C3 - Cluster Command Control
- Developed by ORNL
- Collection of tools for cluster administration
- Commands: cget, cpush, crm, cpushimage, cexec, cexecs, ckill, cshutdown, cnum, cname, clist
- Cluster configuration files
- http://svn.oscar.openclustergroup.org/wiki/oscar:5.1:administration_guide:ch4.3.1_c3_overview
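A short sketch of everyday C3 use; the file paths are placeholders, and the commands act on the nodes named in the C3 cluster configuration file:

    cexec uptime            # run a command on every node in parallel
    cexecs uptime           # serial variant, output in node order
    cpush /etc/hosts        # push a local file out to every node
    crm /tmp/scratch.dat    # remove a file cluster-wide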

pfilter
- Cluster traffic filter
- The default is that client nodes can only send outgoing communications outside the scope of the cluster
- If it is desirable to open up client nodes, the pfilter config file must be modified

System Imager Suite
- Tool for network Linux installations
- Image based; can even chroot into an image
- Also has a database which contains cluster configuration information
- Tied in with C3
- Can handle multiple images per cluster
- Completely automated once an image is created
- http://wiki.systemimager.org/index.php/Main_Page
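A hedged sketch of the two SystemImager operations used most often; recent releases prefix command names with si_ while older ones drop it, and the host and image names here are assumptions carried over from the example above:

    si_getimage --golden-client node1 --image oscarimage   # capture an image from a configured node
    si_updateclient --server headnode                      # resync a client against the image server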

Switcher Environment Manager
- Handles "dot" files
- Does not limit advanced users; designed to help non-savvy users
- Has guards in place that prevent system destruction
- Which MPI to use, on a per-user basis
- Operates on two levels: user and system
- The Modules package is included for advanced users (and used by switcher)
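A sketch of per-user MPI selection with switcher; the version tag is a placeholder for whatever the cluster actually has installed:

    switcher mpi --list        # show the available MPI implementations
    switcher mpi = lam-7.1.2   # make one the default for this user
    switcher mpi --show        # confirm what is now in effect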

OPIUM (OSCAR Password Installer and User Management)
- Login is handled by the Head node
- Once a connection is established, client nodes do not require authentication
- Synchronization is run by root, at intervals
- It stores hash values of the password in the .ssh folder along with a "salt"
- Password changes must be done at the Head node, as all changes propagate from there

Ganglia
- Distributed monitoring system
- Low overhead per node
- XML for data representation
- Robust
- Used in most cluster and grid solutions
- http://ganglia.info/papers/science.pdf
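Since Ganglia's gmond publishes node state as XML, a quick sanity check is to read it straight off the monitoring port; 8649 is gmond's default, and node1 is a placeholder hostname:

    telnet node1 8649    # dumps the current cluster state as one XML document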

LAM/MPI
- LAM - Local Area Multicomputer
- LAM initializes the runtime environment on a select number of nodes
- MPI 1 and some of MPI 2
- MPICH2 can be used if installed
- A two-tiered debugging system exists: snapshot and communication log
- Daemon based
- http://www.lam-mpi.org/
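The daemon-based design shows in the day-to-day workflow, sketched here with a placeholder hostfile and program:

    lamboot hostfile       # start the LAM daemons on the listed nodes
    mpirun -np 4 ./a.out   # run an MPI program under the booted runtime
    lamhalt                # tear the daemons down again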

Open MPI
- Replacement for LAM/MPI, with the same team working on it
- LAM/MPI relegated to upkeep only; all new development is in Open MPI
- Much more robust (operating systems, schedulers)
- Full MPI-2 compliance
- Much higher performance
- http://www.open-mpi.org/
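Unlike LAM, there is no separate boot step: Open MPI launches its own daemons as needed. A sketch with placeholder names:

    mpirun --hostfile myhosts -np 8 ./a.out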

PVM – Parallel Virtual Machine
- Similar in concept to LAM/MPI
- Can be run outside of the scope of Torque and Maui
- Supports Windows nodes as well
- Much better portability
- Not as robust and powerful as Open MPI
- http://www.csm.ornl.gov/pvm/
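PVM is driven from its own console; a sketch of growing and halting the virtual machine, with node2 as a placeholder host:

    pvm              # start the PVM console on the head node
    pvm> add node2   # add a host to the virtual machine
    pvm> conf        # list the hosts currently in it
    pvm> halt        # shut the whole virtual machine down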

Spin-offs
- HA-OSCAR - http://xcr.cenit.latech.edu/ha-oscar/
- VMware with OSCAR - http://www.vmware.com/vmtn/appliances/directory/341
- SSI-OSCAR - http://ssi-oscar.gforge.inria.fr/
- SSS-OSCAR - http://www.csm.ornl.gov/oscar/sss/

Conclusions
- Future direction
- Open MPI
- Windows, Mac OS?