Supercomputing on Windows Clusters: Experience and Future Directions
Andrew A. Chien, CTO, Entropia, Inc.; SAIC Chair Professor, Computer Science and Engineering, UCSD; National Computational Science Alliance
Invited Talk, USENIX Windows, August 4, 2000


Page 1:

Supercomputing on Windows Clusters: Experience and Future Directions

Andrew A. Chien, CTO, Entropia, Inc.; SAIC Chair Professor, Computer Science and Engineering, UCSD; National Computational Science Alliance

Invited Talk, USENIX Windows, August 4, 2000

Page 2:

Overview

Critical Enabling Technologies
The Alliance’s Windows Supercluster
– Design and Performance
Other Windows Cluster Efforts
Future
– Terascale Clusters
– Entropia

Page 3:

External Technology Factors

Page 4:

Microprocessor Performance

[Figure: CPU clock period (ns, log scale) vs. year introduced, 1975–1995. Vector supercomputers: Cray 1S (12.5), Cray X-MP (8.5), Cray Y-MP (6), Cray C90 (4.2). Microprocessors: MIPS R2000 (125), MIPS R3000 (40), HP 7000 (15), R4000 (10), R4400 (6.7), DEC Alpha (5), x86/Alpha (1).]

Micros: 10 MF -> 100 MF -> 1 GF -> 3 GF -> 6 GF (2001?)
=> Memory system performance catching up (2.6 GB/s Alpha 21264 memory BW)

Adapted from Baskett, SGI and CSC Vanguard

Page 5:

Killer Networks

LAN: 10 Mb/s -> 100 Mb/s -> ?
SAN: 12 MB/s -> 110 MB/s (Gb/s) -> 1100 MB/s -> ?
– Myricom, Compaq, Giganet, Intel, ...

Network bandwidths limited by system internal memory bandwidths

Cheap and very fast communication hardware

[Figure: link bandwidths: GigSAN/GigE 110 MB/s, UW SCSI 40 MB/s, Fast Ethernet 12 MB/s, Ethernet 1 MB/s]

Page 6:

Rich Desktop Operating System Environments

Desktop (PC) operating systems now provide
– richest OS functionality
– best program development tools
– broadest peripheral/driver support
– broadest application software/ISV support

[Timeline, 1981–1999: basic device access -> graphical interfaces, audio/graphics -> HD storage, networks -> multiprocess protection, SMP support -> clustering, performance, mass store, HP networking, management, availability, etc.]

Page 7:

Critical Enabling Technologies

Page 8:

Critical Enabling Technologies

Cluster management and resource integration (“use like” one system)

Delivered communication performance
– IP protocols inappropriate

Balanced systems
– Memory bandwidth
– I/O capability

Page 9:

The HPVM System

Goals
– Enable tightly coupled and distributed clusters with high efficiency and low effort (integrated solution)
– Provide usable access through convenient, standard parallel interfaces
– Deliver the highest possible performance and a simple programming model

Page 10:

Delivered Communication Performance

Early 1990s Gigabit testbeds
– 500 Mb/s (~60 MB/s) at 1-megabyte packets
– IP protocols not suited to gigabit SANs

Cluster objective: high-performance communication for both small and large messages

Performance balance shift: networks now faster than I/O, memory, and processors

Page 11:

Fast Messages Design Elements

User-level network access
Lightweight protocols
– flow control, reliable delivery
– tightly coupled link, buffer, and I/O bus management
Poll-based notification
Streaming API for efficient composition (see the sketch below)

Many generations, 1994–1999
– [IEEE Concurrency, 6/97]
– [Supercomputing ’95, 12/95]

Related efforts: UCB AM, Cornell U-Net, RWCP PM, Princeton VMMC/Shrimp, Lyon BIP => VIA standard
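To make the programming style concrete, here is a minimal, runnable C sketch of an FM-like layer: the sender composes a message from pieces (streaming API), and the receiver learns of arrivals only when it polls. The fm_* names and signatures are illustrative assumptions, not the actual HPVM Fast Messages API; an in-process queue stands in for the NIC.

/* Streaming composition on the send side, poll-based handler
 * dispatch on the receive side. Illustrative sketch only. */
#include <stdio.h>
#include <string.h>

#define QCAP   16
#define MAXMSG 256

typedef void (*fm_handler_t)(const char *data, size_t len);
typedef struct { fm_handler_t h; size_t len; char data[MAXMSG]; } fm_msg_t;
typedef struct { fm_msg_t *m; } fm_stream_t;

static fm_msg_t queue[QCAP];   /* stands in for the network receive queue */
static int qhead, qtail;

static fm_stream_t fm_begin_message(fm_handler_t h) {
    fm_msg_t *m = &queue[qtail++ % QCAP];
    m->h = h; m->len = 0;
    return (fm_stream_t){ m };
}

static void fm_send_piece(fm_stream_t *s, const void *buf, size_t len) {
    memcpy(s->m->data + s->m->len, buf, len);  /* gather pieces in place */
    s->m->len += len;
}

static void fm_end_message(fm_stream_t *s) { (void)s; /* message complete */ }

/* Poll-based notification: the host drains messages when it chooses and
 * runs each handler inline; no interrupts. */
static void fm_extract(void) {
    while (qhead != qtail) {
        fm_msg_t *m = &queue[qhead++ % QCAP];
        m->h(m->data, m->len);
    }
}

static void print_handler(const char *data, size_t len) {
    printf("handler got %zu bytes: %.*s\n", len, (int)len, data);
}

int main(void) {
    fm_stream_t s = fm_begin_message(print_handler);
    fm_send_piece(&s, "hdr:", 4);      /* e.g., an MPI envelope */
    fm_send_piece(&s, "payload", 7);   /* user data */
    fm_end_message(&s);
    fm_extract();                      /* poll */
    return 0;
}

The streaming interface is what lets a layer like MPI add its envelope without an extra copy pass over the data; the poll is what keeps notification cost to a few instructions.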

Page 12:

Improved Bandwidth

20 MB/s -> 200+ MB/s (10x)
– Much of the advance is software structure: APIs and implementation
– Deliver *all* of the underlying hardware performance

[Figure: delivered bandwidth (MB/s) vs. year, 1995–1999, rising from roughly 20 MB/s to 200+ MB/s]

Page 13:

Improved Latency

100 μs -> 2 μs overhead (50x)
– Careful design to minimize overhead while maintaining throughput
– Efficient event handling, fine-grained resource management, and inter-layer coordination
– Deliver *all* of the underlying hardware performance

[Figure: one-way latency (microseconds) vs. year, 1995–1999, falling from roughly 20+ μs to a few μs]
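One-way latency figures like these are conventionally measured with a ping-pong test: time many round trips of a tiny message and halve the average. A minimal MPI version in C follows; it is the standard method, not the specific harness used for the numbers above. Run with two ranks, e.g. mpiexec -n 2.

/* Ping-pong microbenchmark: one-way latency is half the round trip. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    char byte = 0;
    const int iters = 10000;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();
    if (rank == 0)   /* average one-way time over 2 * iters transfers */
        printf("one-way latency: %.2f us\n", (t1 - t0) / (2.0 * iters) * 1e6);
    MPI_Finalize();
    return 0;
}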

Page 14:

HPVM = Cluster Supercomputers

Turnkey cluster computing; standard APIs
Network hardware and APIs increase leverage for users, achieve critical mass for the system
Each involved new research challenges and provided deeper insights into the research issues
– Drove continually better solutions (e.g., multi-transport integration, robust flow control and queue management)

[Architecture diagram: standard interfaces (MPI, Put/Get, Global Arrays, BSP) layered over Fast Messages, which runs on Myrinet, ServerNet, Giganet VIA, SMP, and WAN transports; plus scheduling and management (LSF) and performance tools]

HPVM 1.0 (8/1997); HPVM 1.2 (2/1999): multi-transport, dynamic, install; HPVM 1.9 (8/1999): Giganet, SMP

Page 15:

HPVM Communication Performance

Delivers the underlying performance for small messages; the endpoints are the limits
100 MB/s at 1 KB vs. 60 MB/s at 1000 KB
– >1500x improvement

[Figure: bandwidth (MB/s, 0–120) vs. message size (4 B to ~16 KB) for FM on Myrinet and MPI on FM-Myrinet, both approaching ~100 MB/s]

• N1/2 ~ 400 Bytes
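For reference, N1/2 is the message size at which half the asymptotic bandwidth is delivered. Under the standard linear (Hockney) cost model it reduces to the product of startup time and peak rate, so the ~400-byte figure together with the ~100 MB/s peak implies roughly 4 μs of startup, consistent with the latency results above. A sketch in LaTeX:

\[
T(n) = t_0 + \frac{n}{r_\infty}, \qquad
B(n) = \frac{n}{T(n)}, \qquad
B(N_{1/2}) = \frac{r_\infty}{2}
\;\Longrightarrow\;
N_{1/2} = t_0 \, r_\infty .
\]
% with r_inf ~ 100 MB/s and N_1/2 ~ 400 B: t_0 ~ 400 / 10^8 s = 4 us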

Page 16:

HPVM/FM on VIA

FM protocol/techniques portable to Giganet VIA
Slightly lower performance, comparable N1/2

Commercial version: WSDI (stay tuned)

[Figure: bandwidth (MB/s, 0–90) vs. message size (bytes, up to ~16 KB) for FM on Giganet VIA and MPI-FM on Giganet VIA]

• N1/2 ~ 400 Bytes

Page 17:

Unified Transfer and Notification (all transports)

Solution: uniform notify and poll (single queue representation)
Scalability: n into k (hash); arbitrary SMP size or number of NIC cards
Key: integrate variable-sized messages; achieve a single DMA transfer
– no pointer-based memory management, no special synchronization primitives, no complex computation
Memory format provides atomic notification in a single contiguous memory transfer (bcopy or DMA)

[Diagram: per-processor/per-network queues of fixed-size frames at increasing addresses; each frame holds variable-size data followed by a fixed-size trailer with a length/flag word]
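A small, runnable C sketch of that frame format (layout details are assumptions from the diagram, not HPVM source): the payload is packed against the trailer so that one contiguous copy delivers the data and, since the flag word lands last in memory, also performs the notification the poller watches for.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FRAME_BYTES 2048
struct trailer { uint32_t length; uint32_t ready; };     /* written last */
#define DATA_BYTES (FRAME_BYTES - sizeof(struct trailer))

struct frame {
    uint8_t        data[DATA_BYTES];  /* variable-size payload, packed at end */
    struct trailer tr;                /* length + ready flag at fixed offset  */
};

/* Producer: stage payload + trailer contiguously, then one memcpy (standing
 * in for bcopy/DMA) delivers the data and sets the notification flag. A real
 * implementation also needs memory-ordering guarantees. */
static void frame_write(struct frame *f, const void *msg, uint32_t len) {
    uint8_t staged[FRAME_BYTES];
    size_t off = DATA_BYTES - len;              /* pack against the trailer */
    memcpy(staged + off, msg, len);
    struct trailer tr = { len, 1 };
    memcpy(staged + DATA_BYTES, &tr, sizeof tr);
    memcpy((uint8_t *)f + off, staged + off, len + sizeof tr);
}

/* Consumer: poll one flag word per slot; no pointer chasing, no locks. */
static int frame_poll(struct frame *f, void *out) {
    if (!f->tr.ready) return 0;                 /* nothing arrived yet */
    uint32_t len = f->tr.length;
    memcpy(out, f->data + (DATA_BYTES - len), len);
    f->tr.ready = 0;                            /* recycle the slot */
    return (int)len;
}

int main(void) {
    static struct frame f;
    char buf[16];
    frame_write(&f, "hello", 5);
    int n = frame_poll(&f, buf);
    printf("polled %d bytes: %.*s\n", n, n, buf);
    return 0;
}

Because the flag sits at a fixed offset in a fixed-size slot, the poller touches exactly one predictable word per queue slot, which is what makes a single unified queue cheap across SMP, Myrinet, and VIA transports.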

Page 18:

Integrated Notification Results

No polling or discontiguous-access performance penalties
Uniform high performance, stable over configuration changes or the addition of new transports
– no custom tuning for configuration required
Framework is scalable to large numbers of SMP processors and network interfaces

                          Single Transport   Integrated
Myrinet (latency)         8.3 μs             8.4 μs
Myrinet (BW)              101 MB/s           101 MB/s
Shared memory (latency)   3.4 μs             3.5 μs
Shared memory (BW)        200+ MB/s          200+ MB/s

Page 19:

Supercomputer Performance Characteristics (11/99)

                       MF/Proc      Flops/Byte   Flops/NetworkRT
Cray T3E               1200         ~2           ~2,500
SGI Origin2000         500          ~0.5         ~1,000
HPVM NT Supercluster   600          ~8           ~12,000
IBM SP2 (4- or 8-way)  2.6–5.2 GF   ~12–25       ~150–300K
Beowulf (100 Mbit)     600          ~50          ~200,000

Page 20:

The Windows NT Supercluster

Page 21:

Windows Clusters

Early prototypes in CSAG
– 1/1997: 30P, 6 GF
– 12/1997: 64P, 20 GF

Alliance’s Supercluster
– 4/1998: 256P, 77 GF
– 6/1999: 256P*, 109 GF

Page 22:

NCSA’s Windows Supercluster

Rob Pennington (NCSA), Andrew Chien (UCSD)

Using NT, Myrinet Interconnect, and HPVM

128 HP Kayak XU Dual PIII 550 MHz/1GB RAM

[Figure: engineering fluid-flow problem performance (D. Tafti, NCSA), comparing the SGI Origin with the 550 MHz and 300 MHz cluster nodes]

#207 in Top 500 Supercomputing Sites

Page 23:

Windows Cluster System (courtesy Rob Pennington, NCSA)

[System diagram:
– Front-end systems for application development and job submission; LSF batch job scheduler (LSF master on the file servers)
– 128 compute nodes (256 CPUs): 128 dual 550 MHz systems running Windows NT, Myrinet, and HPVM
– File servers: 128 GB home, 200 GB scratch; Fast Ethernet
– FTP to mass storage, daily backups; Internet-facing front end
– Infrastructure and development testbeds (Windows 2000 and NT): 8 4p 550 + 32 2p 300 + 8 2p 333]

Page 24:

Example Application Results

– MILC (QCD)
– Navier-Stokes kernel
– Zeus-MP (astrophysics CFD)

Large-scale science and engineering codes
Comparisons to SGI O2K and Linux clusters

Page 25:

MILC Performance (src: D. Toussaint and K. Orginos, Arizona)

[Figure: GFLOPs (0–12) vs. processors (0–100+) for IA-32/Win NT 550 MHz Xeon, IA-32/Win NT 300 MHz PII, 250 MHz SGI O2K, and T3E 900]

Page 26:

Zeus-MP (Astrophysics CFD)

[Figure: MFLOPs (0–10,000) vs. processor count (1–256) for the NT Supercluster (550 MHz), SGI O2K, and Janus (ASCI Red)]

Page 27:

2D Navier Stokes Kernel

Source: Danesh Tafti, NCSA

AS-PCG MPI Performance - 2D Navier Stokes Kernel

[Figure: GFLOPs (0–20) vs. processors (0–256) for SGI O2000 (250 MHz R10000); NT cluster, Intel 550 MHz PIII Xeon HP Kayak; NT cluster, Intel 300 MHz PII HP Kayak; and the mixed cluster of 128 550 MHz + 128 300 MHz nodes (128 300 MHz Pentium II + 128 550 MHz Pentium III Xeon)]

Page 28:

Applications with High Performance on Windows Supercluster

Zeus-MP (256P, Mike Norman)
ISIS++ (192P, Robert Clay)
ASPCG (256P, Danesh Tafti)
Cactus (256P, Paul Walker/John Shalf/Ed Seidel)
MILC QCD (256P, Lubos Mitas)
QMC Nanomaterials (128P, Lubos Mitas)
Boeing CFD test codes, CFD Overflow (128P, David Levine)
freeHEP (256P, Doug Toussaint)
ARPI3D (256P, weather code, Dan Weber)
GMIN (L. Munro in K. Jordan’s group)
DSMC-MEMS (Ravaioli)
FUN3D with PETSc (Kaushik)
SPRNG (Srinivasan)
MOPAC (McKelvey)
Astrophysical N-body codes (Bode)
Parallel sorting (Rivera, CSAG): 18.3 GB MinuteSort world record

=> Little code retuning, and quickly running ...

Page 29:

MinuteSort

Sort the maximum data disk-to-disk in 1 minute
“Indy sort”
– fixed-size keys, special-purpose sorter, and file format

HPVM/Windows cluster winner for 1999 (10.3 GB) and 2000 (18.3 GB)
– Adaptation of Berkeley NOWSort code (Arpaci and Dusseau)

Commodity configuration ($$ not a metric)
– PCs, IDE disks, Windows
– HPVM and 1 Gb/s Myrinet

Page 30:

MinuteSort Architecture (Luis Rivera, UIUC; Xianan Zhang, UCSD)

[Diagram:
– 32 HP Kayaks: 3Ware controllers, 4 x 20 GB IDE disks each
– 32 HP Netservers: 2 x 16 GB SCSI disks each
– connected by HPVM and 1 Gb/s Myrinet]

Page 31:

Sort Scaling

Concurrent read/bucket-sort/communicate is the bottleneck; faster I/O infrastructure is required (busses and memory, not disks). A sketch of the bucketing step follows below.
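For concreteness, a small C sketch of the bucket-classification step in a NOWSort-style one-pass distributed sort: each record's key prefix picks the destination node, so records can be shipped while input is still streaming off disk. The record format and bucket math are illustrative assumptions, not the contest code.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define KEY_BYTES 10    /* sort-benchmark style: 10-byte keys ...   */
#define REC_BYTES 100   /* ... in 100-byte records (assumed format) */

typedef struct {
    uint8_t key[KEY_BYTES];
    uint8_t rest[REC_BYTES - KEY_BYTES];
} rec_t;

/* Map the leading 16 bits of the key onto one of nnodes buckets; with
 * uniformly distributed keys this balances load across the cluster. */
static int bucket_of(const rec_t *r, int nnodes) {
    uint32_t prefix = ((uint32_t)r->key[0] << 8) | r->key[1];
    return (int)(((uint64_t)prefix * (uint64_t)nnodes) >> 16);
}

/* One pass over a chunk just read from disk: classify each record and hand
 * it to emit(), which in the real pipeline appends it to a per-node staging
 * buffer that is sent while the next chunk is being read. */
static void classify_chunk(const rec_t *recs, size_t n, int nnodes,
                           void (*emit)(int node, const rec_t *r)) {
    for (size_t i = 0; i < n; i++)
        emit(bucket_of(&recs[i], nnodes), &recs[i]);
}

static void print_emit(int node, const rec_t *r) {
    printf("key %02x%02x... -> node %d\n", r->key[0], r->key[1], node);
}

int main(void) {
    rec_t recs[2];
    memset(recs, 0, sizeof recs);                  /* lowest prefix -> node 0 */
    recs[1].key[0] = 0xff; recs[1].key[1] = 0xff;  /* highest -> last node    */
    classify_chunk(recs, 2, 32, print_emit);
    return 0;
}

The classification itself is a few instructions per record; the measured bottleneck is moving the records through the I/O bus and memory system while reads and sends proceed concurrently.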

Page 32:

MinuteSort Execution Time

Page 33:

Reliability

Gossip: “Windows platforms are not reliable”
– Larger systems => intolerably low MTBF

Our experience: “Nodes don’t crash”
– Application runs of 1000s of hours
– Node failure means an application failure; effectively not a problem

Hardware
– Short term: infant mortality (1-month burn-in)
– Long term:
• ~1 hardware problem/100 machines/month
• Disks, network interfaces, memory
• No processor or motherboard problems

Page 34:

Windows Cluster Usage

Lots of large jobs
Runs up to ~14,000 CPU-hours (64p * 9 days)

NT Cluster Usage by Number of Processors, May 1999 to Jul 2000

[Figure: CPU hours (0–500,000 scale) by job-size bucket: 1–31, 32–63, and 64–256 processors]

Page 35:

Other Large Windows Clusters

Sandia’s Kudzu Cluster (144 procs, 550 disks, 10/98)
Cornell’s AC3 Velocity Cluster (256 procs, 8/99)
Others (sampled from vendors)
– GE Research Labs (16, scientific)
– Boeing (32, scientific)
– PNNL (96, scientific)
– Sandia (32, scientific)
– NCSA (32, scientific)
– Rice University (16, scientific)
– U. of Houston (16, scientific)
– U. of Minnesota (16, scientific)
– Oil & gas (8, scientific)
– Merrill Lynch (16, e-commerce)
– UIT (16, ASP/e-commerce)

Page 36:

The AC3 Velocity

64 Dell PowerEdge 6350 servers
• Quad Pentium III 500 MHz/2 MB cache processors (SMP)
• 4 GB RAM/node
• 50 GB disk (RAID 0)/node

GigaNet full interconnect
• 100 MB/s bandwidth between any 2 nodes
• Very low latency

2 TB Dell PowerVault 200S storage
• 2 Dell PowerEdge 6350 dual-processor file servers
• 4 PowerVault 200S units/file server
• 8 x 36 GB disk drives/PowerVault 200S
• Quad-channel SCSI RAID adapter
• 180 MB/s sustained throughput/server

2 TB PowerVault 130T tape library
• 4 DLT 7000 tape drives
• 28-tape capacity

#381 in Top 500 Supercomputing Sites
(courtesy David A. Lifka, Cornell TC)

Page 37:

Recent AC3 Additions

8 Dell PowerEdge 2450 servers (serial nodes)
• Pentium III 600 MHz/512 KB cache
• 1 GB RAM/node
• 50 GB disk (RAID 0)/node

7 Dell PowerEdge 2450 servers (first all-NT-based AFS cell)
• Dual-processor Pentium III 600 MHz/512 KB cache
• 1 GB RAM/node file servers, 512 MB RAM/node database servers
• 1 TB SCSI-based RAID 5 storage
• Cross-platform filesystem support

64 Dell PowerEdge 2450 servers (protein folding, fracture analysis)
• Dual-processor Pentium III 733 MHz/256 KB cache
• 2 GB RAM/node
• 27 GB disk (RAID 0)/node
• Full Giganet interconnect

3 Intel ES6000 & 1 ES1000 Gigabit switches
• Upgrading the server backbone network to Gigabit Ethernet

(courtesy David A. Lifka, Cornell TC)

Page 38:

AC3 Goals

Only commercially supported technology
– Rapid spin-up and spin-out
– Package technologies for vendors to sell as integrated systems

=> All of the commercial packages were moved from SP2 to Windows; all users are back, and more!

Users: “I don’t do Windows” => “I’m agnostic about operating systems, and just focus on getting my work done.”

Page 39:

Protein Folding

http://www.tc.cornell.edu/reports/NIH/resource/CompBiologyTools/

The cooperative motion of ion and water through the gramicidin ion channel. The effective quasi-particle that permeates through the channel includes eight water molecules and the ion. Work of Ron Elber with Bob Eisenberg, Danuta Rojewska and Duan Pin.

Reaction path study of ligand diffusion in leghemoglobin. The ligand is CO (white) and it is moving from the binding site, the heme pocket, to the protein exterior. A study by Weislaw Nowak and Ron Elber.

(courtesy David A. Lifka, Cornell TC)

Page 40:

Protein Folding Per-Processor Performance

Results on different computers for protein structures:

Machine                      System     CPU               CPU speed [MHz]   Compiler   Energy evaluations/second
Blue Horizon (SP San Diego)  AIX 4      Power3            222               xlf        44.3
Linux cluster                Linux 2.2  Pentium III       650               PGF 3.1    59.1
Velocity (CTC)               Win 2000   Pentium III Xeon  500               df v6.1    46.0
Velocity+ (CTC)              Win 2000   Pentium III       733               df v6.1    59.2

Results on different computers for ( / or ) proteins:

Machine                      System     CPU               CPU speed [MHz]   Compiler   Energy evaluations/second
Blue Horizon (SP San Diego)  AIX 4      Power3            222               xlf        15.0
Linux cluster                Linux 2.2  Pentium III       650               PGF 3.1    21.0
Velocity (CTC)               Win 2000   Pentium III Xeon  500               df v6.1    16.9
Velocity+ (CTC)              Win 2000   Pentium III       733               df v6.1    22.4

(courtesy David A. Lifka, Cornell TC)

Page 41:

AC3 Corporate Members

Air Products and Chemicals; Candle Corporation; Compaq Computer Corporation; Conceptual Reality Presentations; Dell Computer Corporation; Etnus, Inc.; Fluent, Inc.; Giganet, Inc.; IBM Corporation; ILOG, Inc.; Intel Corporation; KLA-Tencor Corporation; Kuck & Associates, Inc.; Lexis-Nexis; MathWorks, Inc.; Microsoft Corporation; MPI Software Technologies, Inc.; Numerical Algorithms Group; Portland Group, Inc.; Reed Elsevier, Inc.; Reliable Network Solutions, Inc.; SAS Institute, Inc.; Seattle Lab, Inc.; Visual Numerics, Inc.; Wolfram Research, Inc.

(courtesy David A. Lifka, Cornell TC)

Page 42:

Windows Cluster Summary

Good performance
Lots of applications
Good reliability
Reasonable management complexity (TCO)
The future is bright; uses are proliferating!

Page 43:

Windows Cluster Resources

NT Supercluster, NCSA
– http://www.ncsa.uiuc.edu/General/CC/ntcluster/
– http://www-csag.ucsd.edu/projects/hpvm.html

AC3 Cluster, TC
– http://www.tc.cornell.edu/UserDoc/Cluster/

University of Southampton
– http://www.windowsclusters.org/

=> application and hardware/software evaluation
=> many of these folks will work with you on deployment

Page 44:

Tools and Technologies for Building Windows Clusters

Communication Hardware
– Myrinet, http://www.myri.com/
– Giganet, http://www.giganet.com/
– ServerNet II, http://www.compaq.com/

Cluster Management and Communication Software
– LSF, http://www.platform.com/
– Codine, http://www.gridware.net/
– Cluster CoNTroller, MPI, http://www.mpi-softtech.com/
– Maui Scheduler, http://www.cs.byu.edu/
– MPICH, http://www-unix.mcs.anl.gov/mpi/mpich/
– PVM, http://www.epm.ornl.gov/pvm/

Microsoft Cluster Info
– Win2000, http://www.microsoft.com/windows2000/
– MSCS, http://www.microsoft.com/ntserver/ntserverenterprise/exec/overview/clustering.asp
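To make the software stack concrete, the MPI implementations listed above all build and run plain MPI C programs; a minimal, illustrative example:

/* Smallest useful MPI program: each rank reports its identity. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d alive\n", rank, size);
    MPI_Finalize();
    return 0;
}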

Page 45:

Future Directions

Terascale Clusters
Entropia

Page 46:

A Terascale Cluster

NSF is currently running a $36M Terascale competition
The budget could buy
– an Itanium cluster (3000+ processors)
– ~3 TB of main memory
– >1.5 Gb/s high-speed network interconnect

10+ teraflops in 2000?
? #1 in Top 500 Supercomputing Sites ?

Page 47:

Entropia: Beyond Clusters

• COTS, SHV enable larger, cheaper, faster systems
• Supercomputers (MPPs) to ...
• Commodity clusters (NT Supercluster) to ...
• Entropia

Page 48:

Internet Computing

Idea: assemble large numbers of idle PCs in people’s homes and offices into a massive computational resource
– Enabled by broadband connections, fast microprocessors, huge PC volumes

Page 49:

Unprecedented Power

Entropia network: ~30,000 machines (and growing fast!)
– 100,000 machines at 1 GHz => 100 TeraOp system
– 1,000,000 machines at 1 GHz => 1,000 TeraOp system (1 PetaOp)

Compare: IBM ASCI White (12 TeraOp, 8K processors, $110 million system)
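The arithmetic behind those round numbers, assuming roughly one operation per cycle per machine:

\[
10^{5}\ \text{machines} \times 10^{9}\ \tfrac{\text{ops}}{\text{s}\cdot\text{machine}}
  = 10^{14}\ \text{ops/s} = 100\ \text{TeraOps},
\qquad
10^{6}\ \text{machines} \;\Rightarrow\; 10^{15}\ \text{ops/s} = 1\ \text{PetaOp}.
\]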

Page 50:

Why Participate: Cause Computing!

Page 51:

People Will Contribute

Millions have demonstrated willingness to donate their idle cycles

“Great Cause” Computing
– Current: find ET, large primes, crack DES ...
– Next: find cures for cancer, muscular dystrophy, air and water pollution, ...
• understand the human genome, ecology, fundamental properties of matter, the economy

Participate in science, medical research, and promoting causes that you care about!

Page 52:

Technical Challenges

Heterogeneity (machine, configuration, network)
Scalability (thousands to millions)
Reliability (turn off, disconnect, fail)
Security (integrity, confidentiality)
Performance
Programming
. . .

Entropia: harnessing the computational power of the Internet

Page 53:

Entropia is . . .

Power: a network with unprecedented power and scale

Empower: ordinary people to participate in solving the great social challenges and mysteries of our time

Solve: a team solving fascinating technical problems

Page 54:

Summary

Windows clusters are powerful, successful high-performance platforms
– Cost effective, with excellent performance
– Poised for rapid proliferation

Beyond clusters are Internet computing systems
– Radical technical challenges; vast and profound opportunities

For more information see
– HPVM: http://www-csag.ucsd.edu/
– Entropia: http://www.entropia.com/

Page 55:

Credits

NT Cluster Team Members
– CSAG (UIUC and UCSD Computer Science), my research group
– NCSA Leading Edge Site, Robert Pennington’s team

Talk materials
• NCSA (Rob Pennington, numerous application groups)
• Cornell TC (David Lifka)
• Boeing (David Levine)
• MPISoft (Tony Skjellum)
• Giganet (David Wells)
• Microsoft (Jim Gray)
