
Page 1: The ORNL Cluster Computing Experience…

The ORNL Cluster Computing Experience…

Stephen L. Scott

Oak Ridge National Laboratory, Computer Science and Mathematics Division

Network and Cluster Computing Group

December 6, 2004, RAMS Workshop, Oak Ridge, TN

[email protected] www.csm.ornl.gov/~sscott

Page 2: The ORNL Cluster Computing Experience…

OSCAR

Page 3: The ORNL Cluster Computing Experience…

What is OSCAR?
Open Source Cluster Application Resources

• OSCAR framework (cluster installation, configuration, and management)
  – Remote installation facility
  – Small set of “core” components
  – Modular package & test facility
  – Package repositories

• Use “best known methods”
  – Leverage existing technology where possible

• Wizard-based cluster software installation
  – Operating system
  – Cluster environment (administration and operation)

• Automatically configures cluster components
• Increases consistency among cluster builds
• Reduces time to build / install a cluster
• Reduces need for expertise

(Slide graphic: the OSCAR installation wizard, Step 1 “Start…” through Step 8 “Done!”)
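As a rough sketch of how the wizard-based installation above was typically launched from the head node (the tarball name, unpack location, and the internal NIC name eth1 are assumptions; details varied by OSCAR release):

# a sketch, not the official install procedure; names and versions are assumptions
$ tar xzf oscar-4.0.tar.gz      # unpack the OSCAR distribution on the head node
$ cd oscar-4.0
$ ./install_cluster eth1        # brings up the Step 1 … Step 8 installation wizard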

Page 4: The ORNL Cluster Computing Experience…

OSCAR Components

• Administration/Configuration
  – SIS, C3, OPIUM, Kernel-Picker, NTPconfig, cluster services (dhcp, nfs, ...)
  – Security: Pfilter, OpenSSH

• HPC Services/Tools
  – Parallel libraries: MPICH, LAM/MPI, PVM
  – Torque, Maui, OpenPBS
  – HDF5
  – Monitoring systems: Ganglia, Clumon, …
  – Other 3rd-party OSCAR packages

• Core Infrastructure/Management
  – System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher
  – OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)
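As a hedged illustration of the OPD component above, additional packages could be fetched interactively before running the wizard; the install path and script location shown here are assumptions that varied by release:

# hypothetical OPD session; paths and invocation details are assumptions
$ cd /opt/oscar       # assumed location of the unpacked OSCAR tree
$ ./scripts/opd       # text-menu tool: choose a repository, then the packages to download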

Page 5: The ORNL Cluster Computing Experience…

Open Source Community Development Effort

• Open Cluster Group (OCG)
  – Informal group formed to make cluster computing more practical for HPC research and development
  – Membership is open, directed by a steering committee

• OCG working groups
  – OSCAR (core group)
  – Thin-OSCAR (diskless Beowulf)
  – HA-OSCAR (high availability)
  – SSS-OSCAR (Scalable Systems Software)
  – SSI-OSCAR (single system image)
  – BIO-OSCAR (bioinformatics cluster system)

Page 6: The ORNL Cluster Computing Experience…

OSCAR Core Partners (November 2004)

• Dell
• IBM
• Intel
• Bald Guy Software
• RevolutionLinux
• Indiana University
• NCSA
• Oak Ridge National Laboratory
• Université de Sherbrooke
• Louisiana Tech University

Page 7: The ORNL Cluster Computing Experience…

eXtreme TORC, powered by OSCAR

• 65 Pentium IV machines
• Peak performance: 129.7 GFLOPS
• RAM: 50.152 GB
• Disk capacity: 2.68 TB
• Dual interconnects: Gigabit Ethernet & Fast Ethernet

Page 8: The ORNL Cluster Computing Experience…

Page 9: The ORNL Cluster Computing Experience…

HA-OSCAR: RAS Management for HPC clusters

• The first known field-grade open source HA Beowulf cluster release
• Self-configuring multi-head Beowulf system
• Combines HA and HPC clustering techniques to enable critical HPC infrastructure
• Active/hot standby
• Self-healing, with 3-5 second automatic failover time
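To make the active/hot-standby idea concrete, here is a generic standby loop; this is an illustrative sketch only, not HA-OSCAR's monitoring code, and the host name primary-head, the address 192.168.1.100, and the device eth0 are made up:

# generic hot-standby sketch (NOT HA-OSCAR's implementation)
# the standby head polls the primary and claims the cluster service IP if it stops answering
while true; do
    if ! ping -c 1 -W 1 primary-head > /dev/null 2>&1; then
        ip addr add 192.168.1.100/24 dev eth0   # take over the service address
        break                                   # then start head-node services here
    fi
    sleep 1
done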

Page 10: The ORNL Cluster Computing Experience…

Page 11: The ORNL Cluster Computing Experience…

Scalable Systems Software

Page 12: The ORNL Cluster Computing Experience…

Scalable Systems Software

Participating organizations: ORNL, ANL, LBNL, PNNL, NCSA, PSC, SDSC, SNL, LANL, Ames, IBM, Cray, Intel, SGI

Problem
• Computer centers use incompatible, ad hoc sets of systems tools
• Present tools are not designed to scale to multi-Teraflop systems

Goals
• Collectively (with industry) define standard interfaces between systems components for interoperability
• Create scalable, standardized management tools for efficiently running our large computing centers

Impact
• Reduced facility management costs
• More effective use of machines by scientific applications

Functional areas: resource management, accounting & user management, system build & configure, job management, system monitoring

To learn more, visit www.scidac.org/ScalableSystems

Page 13: The ORNL Cluster Computing Experience…

SSS-OSCAR: Scalable Systems Software

Leverage the OSCAR framework to package and distribute the Scalable Systems Software (SSS) suite, sss-oscar.

sss-oscar – a release of OSCAR containing all SSS software in a single downloadable bundle.

The SSS project is developing standard interfaces for scalable tools:
• Improve interoperability
• Improve long-term usability & manageability
• Reduce costs for supercomputing centers

Map out functional areas:
• Schedulers, job managers
• System monitors
• Accounting & user management
• Checkpoint/restart
• Build & configuration systems

Standardize the system interfaces:
• Open forum of universities, labs, and industry reps
• Define component interfaces in XML
• Develop communication infrastructure

Page 14: The ORNL Cluster Computing Experience…

OSCAR-ized SSS Components

• Bamboo – queue/job manager
• BLCR – Berkeley Lab Checkpoint/Restart
• Gold – accounting & allocation management system
• LAM/MPI (w/ BLCR) – checkpoint/restart-enabled MPI
• MAUI-SSS – job scheduler
• SSSLib – SSS communication library
  – Includes: SD, EM, PM, BCM, NSM, NWI
• Warehouse – distributed system monitor
• MPD2 – MPI process manager
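To give a feel for the checkpoint/restart pieces, a minimal BLCR command-line sketch for a serial job follows (./sim is a hypothetical long-running program; the LAM/MPI integration added mpirun options not shown here):

# minimal BLCR sketch; ./sim is a made-up program
$ cr_run ./sim &                  # start the program with checkpoint support preloaded
$ cr_checkpoint -f sim.ckpt $!    # checkpoint the running process to sim.ckpt
$ cr_restart sim.ckpt             # later, resume the process from the checkpoint file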

Page 15: The ORNL Cluster Computing Experience…

Cluster Power Tools

Page 16: The ORNL Cluster Computing Experience…

C3 Power Tools

• Command-line interface for cluster system administration and parallel user tools

• Parallel execution: cexec
  – Execute across a single cluster or multiple clusters at the same time (see the sketch after this list)

• Scatter/gather operations: cpush / cget
  – Distribute or fetch files for all node(s)/cluster(s)

• Used throughout OSCAR and as the underlying mechanism for tools like OPIUM’s useradd enhancements
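A hedged sketch of the multi-cluster cexec form mentioned above; the cluster names torc: and xtorc: are made-up entries that would have to exist in the C3 configuration file:

# run a command on two clusters at once (cluster names are hypothetical)
$ cexec torc: xtorc: uptime
# restrict to a node range within one named cluster
$ cexec torc:1-8 uptime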

Page 17: The ORNL Cluster Computing Experience…

C3 Building Blocks

• System administration
  – cpushimage – “push” a system image across the cluster
  – cshutdown – remote shutdown to reboot or halt the cluster

• User & system tools
  – cpush – push a single file or directory to the nodes
  – crm – delete a single file or directory on the nodes
  – cget – retrieve files from each node
  – ckill – kill a process on each node
  – cexec – execute an arbitrary command on each node
  – cexecs – serial mode, useful for debugging
  – clist – list each available cluster and its type
  – cname – returns a node name from a given node position
  – cnum – returns a node position from a given node name
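Two hedged examples of the house-keeping tools above (the configuration-file path is an assumption based on common C3 installs):

# list the clusters defined in the C3 configuration file (commonly /etc/c3.conf)
$ clist
# run serially, one node at a time, when interleaved parallel output is hard to read
$ cexecs df -h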

Page 18: The ORNL Cluster Computing Experience…

C3 Power Tools

Example: run hostname on all nodes of the default cluster:
$ cexec hostname

Example: push an RPM to /tmp on the first 3 nodes:
$ cpush :1-3 helloworld-1.0.i386.rpm /tmp

Example: get a file from node 1 and nodes 3-6:
$ cget :1,3-6 /tmp/results.dat /tmp

* The destination may be omitted with cget; the source location is used.
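A small follow-on sketch that chains the two examples above, reusing the hypothetical helloworld RPM:

# push the package, then install it on the same three nodes
$ cpush :1-3 helloworld-1.0.i386.rpm /tmp
$ cexec :1-3 rpm -ivh /tmp/helloworld-1.0.i386.rpm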

Page 19: The ORNL Cluster Computing Experience…

Motivation for Success!

Page 20: The ORNL Cluster Computing Experience…

RAMS – Summer 2004

Page 21: The ORNL Cluster Computing Experience…

Preparation for Success!

Personality & Attitude
• Adventurous
• Self-starter
• Self-learner
• Dedication
• Willing to work long hours
• Able to manage time
• Willing to fail
• Work experience
• Responsible
• Mature personal and professional behavior

Academic
• Minimum of sophomore standing
• CS major
• Above-average GPA
• Extremely high faculty recommendations
• Good communication skills
• Two or more programming languages
• Data structures
• Software engineering