
Page 1: Sun Constellation Linux Cluster

Sun Constellation Linux Cluster: Ranger

System Name: Ranger

Host Name: ranger.tacc.utexas.edu

IP Address: 129.114.50.163

Operating System: Linux

Number of Nodes: 3,936

Number of Processing Cores: 62,976

Total Memory: 123 TB

Peak Performance: 579.4 TFLOPS

Total Disk: 1.73 PB (shared), 31.4 TB (local)

Page 2: Sun Constellation Linux Cluster

System Overview

The Sun Constellation Linux Cluster, named Ranger, is one of the largest computational resources in the world. Ranger was made possible by a grant awarded by the National Science Foundation in September 2006 to TACC and its partners, including Sun Microsystems, Arizona State University, and Cornell University. Ranger entered formal production on February 4, 2008 and supports high-end computational science for NSF TeraGrid researchers throughout the United States, academic institutions within Texas, and components of the University of Texas System.

The Ranger system comprises 3,936 16-way SMP compute nodes providing 15,744 AMD Opteron™ processors for a total of 62,976 compute cores, 123 TB of total memory, and 1.7 PB of raw global disk space. It has a theoretical peak performance of 579 TFLOPS. All Ranger nodes are interconnected using InfiniBand technology in a full-CLOS topology providing 1 GB/sec point-to-point bandwidth. A 10 PB capacity archival system is available for long-term storage and backups. Example pictures highlighting various components of the system are shown in Figures 1-3.
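
The aggregate figures quoted above follow directly from the per-node specifications given later in this document. The short Python sketch below is only an illustration of that arithmetic; the input values are taken from Tables 1.1, 1.2, and 1.4.

    # Aggregate Ranger figures derived from per-node specifications
    # (input values are taken from this document; the script is illustrative only).
    nodes = 3936
    cores_per_node = 16            # 4 sockets x 4 cores per socket
    mem_per_node_gb = 32           # GB per node
    peak_per_core_gflops = 9.2     # 2.3 GHz x 4 floating-point results per clock

    total_cores = nodes * cores_per_node                      # 62,976 cores
    total_mem_tb = nodes * mem_per_node_gb / 1024             # ~123 TB
    peak_tflops = total_cores * peak_per_core_gflops / 1000   # ~579.4 TFLOPS

    print(f"{total_cores:,} cores, {total_mem_tb:.0f} TB memory, {peak_tflops:.1f} TFLOPS peak")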

 

Figure 1. One of six Ranger rows: Management/IO Racks (front, black), Compute Rack (silver), and In-row Heat Exchanger (black).

Figure 2. SunBlade x6420 motherboard (compute blade).

Figure 3. InfiniBand Constellation Core Switch.

 

Architecture

The Ranger compute and login nodes run a Linux OS and are managed by the Rocks 4.1 cluster toolkit. Two 3,456-port Constellation switches provide dual-plane access between the NEMs (Network Element Modules) of each 12-blade chassis. Several global, parallel Lustre file systems have been configured to target different storage needs. The configuration and features of the compute nodes, interconnect, and I/O systems are described below and summarized in Tables 1.1 through 1.6.

Page 3: Sun Constellation Linux Cluster

Ranger is a blade-based system. Each node is a SunBlade x6420 blade running a 2.6.18.8 Linux kernel. Each node contains four quad-core 64-bit AMD Opteron processors (16 cores in all) on a single board, operating as an SMP unit. Each core runs at 2.3 GHz and supports 4 floating-point operations per clock period, for a peak performance of 9.2 GFLOPS/core, or 147.2 GFLOPS/node.
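
As a worked example of the peak numbers in the preceding paragraph, the per-core and per-node figures follow from the clock rate and the four floating-point results per clock period (illustrative Python only):

    # Peak floating-point rate per core and per node (illustration of the figures above).
    clock_ghz = 2.3          # core clock frequency
    flops_per_cycle = 4      # floating-point results per clock period
    cores_per_node = 16      # 4 sockets x 4 cores

    gflops_per_core = clock_ghz * flops_per_cycle        # 9.2 GFLOPS
    gflops_per_node = gflops_per_core * cores_per_node   # 147.2 GFLOPS

    print(f"{gflops_per_core:.1f} GFLOPS/core, {gflops_per_node:.1f} GFLOPS/node")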

Each node contains 32 GB of memory. The memory subsystem has a 1.0 GHz HyperTransport system bus and 2 channels with 667 MHz DDR2 DIMMs. Each socket possesses an independent memory controller connected directly to an L3 cache.
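
For context, a rough theoretical memory bandwidth can be estimated from the DIMM speed grade: PC2-5300 corresponds to about 5.3 GB/sec per 64-bit channel, so two channels give roughly 10.7 GB/sec per memory controller. The sketch below is an estimate under that assumption (the channel width and per-socket attribution are inferred, not stated in this document) and ignores real-world efficiency.

    # Rough theoretical memory bandwidth estimate (assumption: standard 64-bit DDR2 channels).
    transfer_rate_mts = 667      # DDR2-667: millions of transfers per second
    bytes_per_transfer = 8       # 64-bit channel width
    channels = 2                 # channels per memory controller

    gb_per_sec = transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9
    print(f"~{gb_per_sec:.1f} GB/sec theoretical peak")   # ~10.7 GB/sec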

The interconnect topology is a 7-stage, full-CLOS fat tree with two large Sun InfiniBand Datacenter switches at the core of the fabric (each switch supports a maximum of 3,456 SDR InfiniBand ports). Each of the 328 compute chassis is connected directly to the 2 core switches. Twelve additional frames are also connected directly to the core switches and provide file system, administration, and login capabilities.
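
The quoted point-to-point bandwidth and MPI latency are the kind of figures a simple ping-pong benchmark measures. Below is a minimal sketch using mpi4py; mpi4py and NumPy are assumptions made here for illustration and are not part of the Ranger documentation, and an equivalent C MPI program would follow the same pattern.

    # Minimal MPI ping-pong sketch between ranks 0 and 1 (run with at least 2 ranks).
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    nbytes = 1 << 20                 # 1 MiB message
    buf = np.zeros(nbytes, dtype='b')
    reps = 100

    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=0)
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    t1 = MPI.Wtime()

    if rank == 0:
        rtt = (t1 - t0) / reps                   # average round-trip time
        bw = 2 * nbytes / rtt / 1e9              # GB/sec, counting both directions
        print(f"avg round trip: {rtt*1e6:.1f} us, effective bandwidth: {bw:.2f} GB/sec")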

File systems: Ranger's file systems are hosted on 72 Sun x4500 disk servers, each containing 48 SATA drives, and six Sun x4600 metadata servers. From this aggregate space of 1.7 PB, three global file systems are available to all users (see Table 1.6).
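
Those numbers also imply the per-drive capacity: 72 servers with 48 SATA drives each is 3,456 drives behind the 1.7 PB aggregate, or roughly 500 GB per drive. The per-drive figure is inferred here rather than stated in the document; a minimal sketch of the arithmetic:

    # Implied drive count and per-drive capacity of the Lustre storage
    # (the per-drive size is inferred from the stated totals, not given in the document).
    servers = 72
    drives_per_server = 48
    raw_capacity_tb = 1700        # ~1.7 PB expressed in TB

    total_drives = servers * drives_per_server        # 3,456 drives
    tb_per_drive = raw_capacity_tb / total_drives     # ~0.49 TB, i.e. roughly 500 GB

    print(f"{total_drives:,} drives, ~{tb_per_drive*1000:.0f} GB per drive")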

Table 1.1 System Configuration and Performance

Component                        Technology                              Performance/Size
Peak floating-point operations   -                                       579 TFLOPS (peak)
Nodes (blades)                   Four quad-core AMD Opteron processors   3,936 nodes / 62,976 cores
Memory                           Distributed                             123 TB aggregate
Shared disk                      Lustre parallel file system             1.7 PB raw
Interconnect                     InfiniBand switch                       1 GB/sec unidirectional point-to-point bandwidth; 2.3 microsecond max MPI latency

Table 1.2 SunBlade x6420 Compute Node

Component                             Technology
Sockets per node / cores per socket   4 / 4 (Barcelona)
Clock speed                           2.3 GHz
Memory per node                       32 GB
System bus                            HyperTransport, 6.4 GB/sec bidirectional
Memory                                2 GB DDR2/667, PC2-5300 ECC-registered DIMMs
PCI Express                           x8
Compact Flash                         8 GB

Table 1.3 Sun x4600 Login Nodes

Component                             Technology
Login nodes                           2 (login3.tacc.utexas.edu, login4.tacc.utexas.edu); a specific node is selected when accessing ranger.tacc.utexas.edu
Sockets per node / cores per socket   4 / 4 (Barcelona)
Clock speed                           2.2 GHz
Memory per node                       32 GB

Page 4: Sun Constellation Linux Cluster

Table 1.4 AMD Barcelona Processor

Technology                  64-bit
Clock speed                 2.3 GHz
FP results / clock period   4
Peak performance / core     9.2 GFLOPS
L3 cache                    2 MB on-die (shared)
L2 cache                    4 x 512 KB
L1 cache                    64 KB

Table 1.5 Storage Systems

Storage Class          Size        Architecture                       Features
Local                  8 GB/node   Compact Flash                      Not available to users (O/S only)
Parallel               1.7 PB      Lustre, Sun x4500 disk servers     72 Sun x4500 I/O data servers, 6 Sun x4600 metadata servers
Ranch (tape storage)   2.8 PB      SAM-FS (Storage Archive Manager)   10 GB/s connection through 4 GridFTP servers

Table 1.6 Parallel File Systems

Storage Class   Size      Quota (per user)   Retention Policy
$HOME           ~100 TB   6 GB               Backed up nightly; not purged
$WORK           ~200 TB   350 GB             Not backed up; not purged
$SCRATCH        ~800 TB   400 TB             Not backed up; purged every 10 days

Student Name: Ahmad Abu Obaid
Student ID: 20620136