Managed by UT-Battelle for the Department of Energy XT9? Integrating and Operating a Conjoined XT4+XT5 System presented by Don Maxwell HPC Systems ORNL


Managed by UT-Battelle for the Department of Energy

XT9? Integrating and Operating a Conjoined XT4+XT5 System

presented by Don Maxwell, HPC Systems, ORNL


What is a Conjoined XT4+XT5?

Jaguar XT4

Jaguar XT5

MOAB on Cray XT


What is a Conjoined XT4+XT5?

                                        Jaguar XT5                      Jaguar XT4
Cabinets                                200                             84
Processors                              AMD Opteron 2.3 GHz quad-core   AMD Opteron 2.1 GHz quad-core
Compute Cores                           149,504                         31,328
Memory (TB)                             300                             62
Links                                   115,200                         48,384
Theoretical Peak Performance (TFLOPS)   1,375                           263
I/O Capacity (TB)                       4,100*                          700
I/O Bandwidth (GB/s)                    100*                            40
Service Nodes                           256                             116

* The current filesystem on Jaguar XT5 is an InfiniBand direct-attached configuration that uses roughly half of the available storage capacity. The other half is being used to develop a Lustre routed filesystem called Spider. The two halves will be merged into a single Spider configuration, which will be mounted center-wide during the next few months.


What is a Conjoined XT4+XT5?

Combining two resources into one

SION

External Logins

Need a platform for access to both machines

[Diagram labels: Jaguar XT4, Jaguar XT5, Cisco IB Core 1, Cisco IB Core 2, Cisco IB Aggregation, Cisco IB 2nd Floor, External Logins]


Routing XT Computes

XT Compute Node Routes

192 IB nodes XT5

48 IB nodes XT4

IB Router <-> IB Router Selection based on IB switch

Compute node router selection based on distance
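Distance-based router selection can be illustrated with a minimal sketch. It assumes that nodes are addressed by coordinates on a 3D torus and that "distance" means hop count with wraparound links; the slides do not say how distance is measured, so the coordinates, torus dimensions, and function names below are hypothetical.

```python
from typing import Iterable, Tuple

Coord = Tuple[int, int, int]


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Hop count between two nodes on a 3D torus with wraparound links."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def nearest_router(compute: Coord, routers: Iterable[Coord], dims: Coord) -> Coord:
    """Pick the router with the fewest hops; ties go to the first one listed."""
    return min(routers, key=lambda r: torus_hops(compute, r, dims))
```

With a hypothetical 8x8x8 torus, a compute node at (0, 0, 0) prefers a router at (7, 0, 0) over one at (4, 0, 0), because the wraparound link makes the former a single hop away.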

[Diagram labels: SION, Jaguar XT4 Computes, Jaguar XT5 Computes, XT4 Service Nodes, XT5 Service Nodes, Cisco IB Core 1, Cisco IB Core 2]


External Login Nodes

Motivation

– Single platform for accessing both XTs

– To provide a much more capable platform for software development than the current service nodes directly attached to the XTs

Prototype Hardware

– Quad socket AMD Opteron 2.0 GHz quad-core

– 32 GB memory

– SLES 10.2

– AutoYaST

– Cfengine

– Conserver


External Login Nodes

XT Software

– Batch Systems

– Filesystems

– Cray XT Stack


Batch

Moab/TORQUE

– History dating back to 2005

– First port to XT platform on ORNL development system

– Requirements discussion in December for conjoined project

Two potential development paths

– Modify existing XT native resource manager

– Use grid model

Modifying existing RM seemed to be the easiest path


Moab features support NCCS mission

Job templates to categorize job sizes

– Large jobs favored to support capability mission

– DOE metrics requirement for Capability Usage

In the first year following general availability of a new or upgraded system, 35% of the CPU time used on the system will be accumulated by jobs using 20% or more of the available processors

In subsequent years, 30% of the CPU time used on the system will be accumulated by jobs using 30% or more of the available processors

Supported through use of Moab job templates/fairshare/priorities

Identity manager to import project priorities

– RATS maintains project information

– Priorities changed dynamically via import from ASCII file

Size 0 jobs eliminate need for user cron jobs

– Cron can cause issues with filesystem unmounts

Batch control more desirable

– Accounting method same as traditional batch jobs

LENS Visualization cluster job pre-emption

– 32 nodes with each node containing four quad-core 2.3 GHz AMD Opteron processors with 64 GB of memory, and 2 NVIDIA 8800 GTX GPUs

– Computational jobs allowed unless an analysis job appears
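The DOE Capability Usage metric quoted above is a ratio of CPU time, and a short sketch shows how it could be checked against job records. The job tuples, the function name, and the way CPU time is computed (processors times wallclock hours) are assumptions for illustration, not the accounting method NCCS uses.

```python
from typing import Iterable, Tuple


def capability_fraction(
    jobs: Iterable[Tuple[int, float]],
    total_procs: int,
    size_threshold: float,
) -> float:
    """Fraction of CPU time accumulated by jobs that use at least
    size_threshold (e.g. 0.20) of total_procs.

    Each job is a hypothetical (processors, wallclock_hours) pair.
    """
    total = 0.0
    large = 0.0
    for procs, hours in jobs:
        cpu_hours = procs * hours
        total += cpu_hours
        if procs >= size_threshold * total_procs:
            large += cpu_hours
    return large / total if total else 0.0
```

For the first-year target, the result of `capability_fraction(jobs, 149504, 0.20)` would be compared against 0.35, where 149,504 is the XT5 core count from the comparison table.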


Batch

What’s the model?

ALPS only has knowledge of one XT/domain

Passwordless ssh using sudo for communication

External Moab allows each XT to operate independently

[Diagram labels: Jaguar XT4, Jaguar XT5, External Logins, ALPS and TORQUE (shown twice), Moab, External Server]


Batch

Features

– Target a particular resource

qsub

msub -l partition=(xt4|xt5)

– No specific resource

msub

Load balancer

– Simple algorithm based purely on availability of resources at the time of job launch

– Open to more sophisticated algorithm

Delay choice until runtime

Queue depth

Historical utilization

– Restrict each partition based on user

– Direct jobs based on size using job templates
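The simple load balancer described above, which looks only at resource availability, might be sketched as follows. This is not Moab's implementation: the function name, the free-core dictionary, and the tie-breaking rule (prefer the partition with the most free cores) are assumptions.

```python
from typing import Dict, Optional


def choose_partition(requested_cores: int, free_cores: Dict[str, int]) -> Optional[str]:
    """Return a partition that can fit the job right now, or None.

    Among partitions with enough free cores, the one with the most
    free cores wins (an assumed tie-breaking rule).
    """
    candidates = {
        name: free for name, free in free_cores.items() if free >= requested_cores
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

A job that names a partition explicitly (`msub -l partition=xt4`) would bypass this choice entirely; the sketch covers only the case where no specific resource is requested.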


Filesystems

Production

– 3 Fibre Channel Lustre filesystems on XT4

150TB spans first half of DDN 9550s

150TB spans second half of DDN 9550s

300TB spans all DDN 9550s

– 1 InfiniBand direct-attached 4.5PB Lustre filesystem on XT5

How do I mount these filesystems on external login nodes?

Answer: Not easily


Filesystems

Method

– LNET routing via SION

Advantages

– Users have same filesystems available to them on external login nodes

However…

– Using XTs as Lustre file servers is a bad idea

Hangs for users accessing filesystems

– Users have to compile for multiple filesystems if allowing the system to choose the partition

LMON

– Script to monitor health of filesystems

– lctl ping of the MDS to detect state

– umount problems

/etc/mtab locking issues
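An LMON-style health check could look roughly like the sketch below. It is not the LMON script itself: it assumes that `lctl ping <nid>` exits non-zero when the MDS does not answer, and the function name, the timeout value, and the injectable `run` parameter (there so the check can be exercised without a Lustre client) are all hypothetical.

```python
import subprocess
from typing import Callable


def mds_is_healthy(
    nid: str,
    timeout_s: int = 10,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if `lctl ping <nid>` exits 0 within timeout_s seconds."""
    try:
        result = run(
            ["lctl", "ping", nid],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung ping or a missing lctl binary both count as unhealthy.
        return False
    return result.returncode == 0
```

Treating a timeout as "unhealthy" matters here because the failure mode described on this slide is a hang rather than a clean error.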


[Diagram labels: Jaguar XT4, Jaguar XT5, SION, External Logins]


Cray XT software

Same versions of XT software must be available on external logins

Method

– xt-rpm utility

External NFS Sharedroot for Cray XT software

/opt/xt* links back to External NFS Sharedroot

Separate RPM database

Default programming environment for both XTs same

– Software packages per machine can vary


XT Modules

Module named XT4 or XT5 will be loaded as a key to determine which machine is being addressed

XT-specific commands such as apstat, xtnodestats, etc. will be wrapped based on XT module

Lustre scratch directory /tmp/work/$USER changes based on XT module

Provides TORQUE environment
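The idea of using the loaded module as a key for command wrappers can be sketched as below. The sketch assumes the wrapper reads the colon-separated `LOADEDMODULES` variable that Environment Modules maintains and forwards the command over ssh; the slides do not describe the wrapper mechanism, so the transport, the host names, and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def active_xt(loadedmodules: str) -> Optional[str]:
    """Return 'xt4' or 'xt5' if such a module appears in the
    colon-separated module list, otherwise None."""
    for entry in loadedmodules.split(":"):
        name = entry.split("/")[0].lower()  # drop any "/version" suffix
        if name in ("xt4", "xt5"):
            return name
    return None


def wrap(command: List[str], loadedmodules: str, hosts: Dict[str, str]) -> List[str]:
    """Build the argv that runs `command` on the XT selected by the module."""
    machine = active_xt(loadedmodules)
    if machine is None:
        raise RuntimeError("load the XT4 or XT5 module first")
    return ["ssh", hosts[machine], *command]
```

A wrapper for `apstat` would then call something like `wrap(["apstat"], os.environ.get("LOADEDMODULES", ""), hosts)` and execute the resulting argv.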


Status

Prototype up and working

– External login node up with SLES 10.2

– Using XT5 TDS/XT4 TDS for XTs

– Cray software installed and communication working with XTs using XT[45] modules

– Local Lustre filesystems from each XT mounted

– Single scheduler running on external server

4 External Logins in testing for Jaguar with SLES 10.2

– Local Lustre filesystems from XT4/XT5 mounted

– LMON hardening

– Moab policy review for final configuration underway


XT9?

Futures

– Filesystems

Spiders everywhere

– More sophisticated Moab load-balancing algorithm

– Moab priorities based on fairshare force Grid model?

– Cray software is multi-XT aware

– Spanning machines

Moab can span partitions using a QOS with SPAN feature

Requires Open MPI or another MPI derivative


Questions?
