
Real Parallel Computers

Modular data centers

Background Information

• Strohmaier, Dongarra, Meuer, Simon: "Recent trends in the marketplace of high performance computing", Parallel Computing, 2005

Short history of parallel machines

• 1970s: vector computers
• 1990s: Massively Parallel Processors (MPPs)
– Standard microprocessors, special network and I/O
• 2000s:
– Cluster computers (using standard PCs)
– Advanced architectures (BlueGene)
– Comeback of vector computers (Japanese Earth Simulator)
– IBM Cell/BE
• 2010s:
– Multi-cores, GPUs
– Cloud data centers

Performance development and predictions

Clusters

• Cluster computing
– Standard PCs/workstations connected by a fast network
– Good price/performance ratio
– Exploit existing (idle) machines or use (new) dedicated machines
• Cluster computers vs. supercomputers (MPPs)
– Processing power is similar: both are based on microprocessors
– Communication performance was the key difference
– Modern networks (Myrinet, Infiniband, 10G Ethernet) have bridged this gap
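To make the cluster programming model concrete, below is a minimal message-passing sketch in C using MPI, the interface commonly used on such clusters. This is an illustrative sketch, not code from the slides; it assumes an MPI implementation (e.g. Open MPI or MPICH) is installed on the nodes.

    /* Minimal MPI sketch: every process reports where it runs, then
       rank 1 sends one message to rank 0 over the cluster network. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size, len, value;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);
        printf("process %d of %d on %s\n", rank, size, host);

        value = 42;
        if (rank == 1) {
            /* This send traverses the interconnect (Myrinet, Infiniband, ...). */
            MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0 && size > 1) {
            MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 0 received %d from rank 1\n", value);
        }

        MPI_Finalize();
        return 0;
    }

Typically built with mpicc and launched across the nodes with something like mpirun -np 4 ./a.out; the exact commands depend on the MPI installation.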

Overview

• Cluster computers at our department
– DAS-1: 128-node Pentium Pro / Myrinet cluster (gone)
– DAS-2: 72-node dual-Pentium-III / Myrinet-2000 cluster
– DAS-3: 85-node dual-core dual-Opteron / Myrinet-10G cluster
– DAS-4: 72-node cluster with accelerators (GPUs etc.)

• Part of a wide-area system: the Distributed ASCI Supercomputer

Distributed ASCI Supercomputer (1997-2001)

DAS-2 Cluster (2002-2006)

• 72 nodes, each with 2 CPUs (144 CPUs in total)
• 1 GHz Pentium-III
• 1 GB memory per node
• 20 GB disk
• Fast Ethernet 100 Mbit/s
• Myrinet-2000 2 Gbit/s (crossbar)
• Operating system: Red Hat Linux
• Part of wide-area DAS-2 system (5 clusters with 200 nodes in total)

[Figure: DAS-2 cluster photo showing the Myrinet switch and Ethernet switch.]

DAS-3 Cluster (Sept. 2006)

• 85 nodes, each with 2 dual-core CPUs (340 cores in total)
• 2.4 GHz AMD Opterons (64-bit)
• 4 GB memory per node
• 250 GB disk
• Gigabit Ethernet
• Myrinet-10G 10 Gb/s (crossbar)
• Operating system: Scientific Linux
• Part of wide-area DAS-3 system (5 clusters, 263 nodes), using SURFnet-6 optical network with 40-80 Gb/s wide-area links

DAS-3 Networks

[Figure: DAS-3 network topology. The 85 compute nodes connect via 85 * 1 Gb/s Ethernet links to a Nortel 5530 + 3 * 5510 Ethernet switch, and via 85 * 10 Gb/s Myrinet links to a Myri-10G switch. A 10 Gb/s Ethernet blade in the Myrinet switch connects to the Ethernet switch over 8 * 10 Gb/s fiber links. A Nortel OME 6500 with DWDM blade provides 80 Gb/s DWDM connectivity to SURFnet6; the campus uplink is 1 or 10 Gb/s. The headnode (10 TB mass storage) attaches via 10 Gb/s Myrinet and 10 Gb/s Ethernet.]

DAS-3 Networks

[Figure: photo of the Myrinet and Nortel switches.]

DAS-4 (Sept. 2010)

• 72 nodes (2 quad-core Intel Westmere Xeon E5620, 24 GB memory, 2 TB disk)

• 2 fat nodes with 94 GB memory

• Infiniband network + 1 Gb/s Ethernet

• 16 NVIDIA GTX 480 graphics accelerators (GPUs)

• 2 Tesla C2050 GPUs

DAS-4 performance

• Infiniband network:
– One-way latency: 1.9 microseconds
– Throughput: 22 Gbit/s

• CPU performance:
– 72 nodes (576 cores): 4399.0 GFLOPS
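Two notes on these numbers. For the CPU figure: assuming each 2.4 GHz E5620 core peaks at 4 double-precision FLOPs/cycle (SSE), 576 cores give 576 x 2.4 x 4 = 5529.6 GFLOPS theoretical peak, so the measured 4399.0 GFLOPS (presumably a LINPACK run) is about 80% of peak. The network figures are the kind a ping-pong microbenchmark produces; below is a minimal MPI sketch of such a benchmark (message size and iteration count are illustrative choices, not the settings actually used).

    /* Ping-pong sketch: ranks 0 and 1 bounce a buffer back and forth.
       Half the average round-trip time is the one-way latency; the
       bytes moved per second give the throughput. Run with 2 processes. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        const int iters = 1000;
        const int bytes = 1 << 20;  /* 1 MB stresses throughput; use ~1 byte for latency */
        char *buf = malloc(bytes);
        int rank, i;
        double t0, t;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t = MPI_Wtime() - t0;

        if (rank == 0) {
            printf("one-way latency: %.3f us\n", t / iters / 2 * 1e6);
            printf("throughput: %.2f Gbit/s\n",
                   (double)bytes * 8 * 2 * iters / t / 1e9);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }

Small messages expose the latency term and large messages the bandwidth term of the usual cost model t(n) = latency + n / bandwidth.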

Blue Gene/L Supercomputer

Blue Gene/L

Packaging hierarchy (peak performance and memory at each level):

• Chip: 2 processors, 2.8/5.6 GF/s, 4 MB
• Compute Card: 2 chips (1x2x1), 5.6/11.2 GF/s, 1.0 GB
• Node Card: 16 compute cards (32 chips, 4x4x2), 0-2 I/O cards, 90/180 GF/s, 16 GB
• Rack: 32 node cards, 2.8/5.6 TF/s, 512 GB
• System: 64 racks (64x32x32), 180/360 TF/s, 32 TB
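A sanity check on these numbers (the reading of the paired figures is an assumption, not stated on the slide): each chip holds two PowerPC 440 cores at 700 MHz whose double FPUs peak at 4 FLOPs/cycle, i.e. 0.7 x 4 = 2.8 GF/s per core, so each x/y pair reads as peak with one core vs. both cores computing (coprocessor vs. virtual-node mode). The levels then multiply up consistently: 2 x 16 x 32 x 64 = 65,536 chips, 65,536 x 5.6 GF/s ≈ 367 TF/s for the quoted 360 TF/s system peak, and 65,536 x 512 MB = 32 TB of memory.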

Blue Gene/L Networks

• 3-Dimensional Torus
– Interconnects all compute nodes (65,536)
– Virtual cut-through hardware routing
– 1.4 Gb/s on all 12 node links (2.1 GB/s per node)
– 1 µs latency between nearest neighbors, 5 µs to the farthest
– Communications backbone for computations
– 0.7/1.4 TB/s bisection bandwidth, 68 TB/s total bandwidth
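The small gap between nearest-neighbor (1 µs) and farthest (5 µs) latency follows from cut-through routing: the per-hop cost is small, and the wrap-around links keep hop counts low. A small C sketch (dimensions taken from the 64x32x32 system above) of the minimal hop count between two torus nodes:

    /* Minimal hop count on a 3-D torus: in each dimension a message
       can travel either way around the ring, so the distance is the
       smaller of the two directions. */
    #include <stdio.h>
    #include <stdlib.h>

    static int ring_dist(int a, int b, int dim) {
        int d = abs(a - b);
        return d < dim - d ? d : dim - d;
    }

    int main(void) {
        const int X = 64, Y = 32, Z = 32;   /* Blue Gene/L system dimensions */
        /* Nearest neighbor: 1 hop. */
        printf("neighbor: %d hops\n", ring_dist(0, 1, X));
        /* Farthest node: half of each dimension, 32 + 16 + 16 = 64 hops. */
        printf("farthest: %d hops\n",
               ring_dist(0, X / 2, X) + ring_dist(0, Y / 2, Y) + ring_dist(0, Z / 2, Z));
        return 0;
    }

Sixty-four hops at cut-through per-hop costs of tens of nanoseconds is consistent with the 5 µs worst case.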

• Global Collective
– One-to-all broadcast functionality
– Reduction operations functionality
– 2.8 Gb/s of bandwidth per link
– One-way traversal latency: 2.5 µs
– Interconnects all compute and I/O nodes (1024)
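One-to-all broadcast and reductions are exactly the operations MPI collectives map onto, so on Blue Gene/L such calls can run over this dedicated network rather than the torus. A minimal sketch (values are illustrative):

    /* Sketch of the two operations the collective network accelerates:
       a one-to-all broadcast and an all-to-one sum reduction. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size, param, local, sum;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        param = (rank == 0) ? 42 : 0;
        MPI_Bcast(&param, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* one-to-all broadcast */

        local = rank;
        MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);  /* reduction */

        if (rank == 0)
            printf("broadcast %d; sum of ranks = %d (expected %d)\n",
                   param, sum, size * (size - 1) / 2);
        MPI_Finalize();
        return 0;
    }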

• Low-Latency Global Barrier and Interrupt
– Round-trip latency: 1.3 µs

• Ethernet
– Incorporated into every node ASIC
– Active in the I/O nodes (1 I/O node per 8-64 compute nodes)
– Handles all external communication (file I/O, control, user interaction, etc.)

• Control Network