Chemical Supercomputing on the Cheap: Cobalt cluster. HPC Workshop, Ottawa Nov 17, 2000.
Chemical Supercomputing on the Cheap: Cobalt cluster
94GFlops computer system at cdn$3680*/gigaflop
*Mar’98 – Feb’99
S. Patchkovskii, R. Schmid, and T. Ziegler
Department of Chemistry, University of Calgary, 2500 University Dr. NW, Calgary, Alberta, T2N 1N4 Canada
Disclaimer
We are chemists, not computer scientists; we do not build computers for a living – we build them for doing chemistry
Outline of the talk
• What do we need, and why build our own?
• What went in: Nodes and communication hardware
• What it runs on: System and application software
• How does it work: Stability, performance, and resource utilization
• What can it do: Chemical research with Cobalt
• Conclusions and Acknowledgements
Criteria for Cobalt design
• Should provide a useful resource for 5 years
• Minimal hardware purchase and maintenance cost
  – No low-volume “boutique” hardware
  – No experimental “bleeding edge” hardware
• Minimal system maintenance
  – Use software which has been around for a while
  – If it works, do not tinker with it!
Cobalt Nodes
Compaq Personal Workstation 500au.

| Component | Specification |
|---|---|
| CPU | Alpha 21164A, 500 MHz |
| Cache | 96Kb on-chip (L1 and L2) |
| Memory | 64Mb to 512Mb |
| Disk | 4Gb, 7200rpm UltraSCSI |
| CD-ROM | 24x |
| Graphics | none |
| Network | FastEthernet, full duplex |
| Peak flops | 10^9 Flop/second |
| SpecInt 95 | 15.7* |
| SpecFP 95 | 19.5* |
| Average price | cdn$3,468 (Mar’98-Apr’99) |

(*) SpecInt and SpecFP values estimated from published results for a 500au system with 2Mb L3 cache.
For comparison, in May’99 a top-of-the-line 550MHz Intel Xeon workstation with 512Kb of L2 cache achieved 24.4 SpecInt 95 and 17.1 SpecFP 95 and cost about cdn$4400 from Dell.
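The comparison on this slide can be turned into a quick price/performance ratio. This is a minimal sketch using only the SPEC figures and prices quoted here; the helper name and the choice of "score per thousand dollars" as the metric are illustrative assumptions, not something taken from the talk.

```python
# Figures quoted on the slide (the Alpha SPEC values are the authors' estimates).
ALPHA_500AU = {"specint95": 15.7, "specfp95": 19.5, "price_cdn": 3468.0}
XEON_550 = {"specint95": 24.4, "specfp95": 17.1, "price_cdn": 4400.0}


def per_kilodollar(machine, metric):
    """Benchmark score bought per cdn$1000 (hypothetical helper)."""
    return machine[metric] / machine["price_cdn"] * 1000.0


for name, machine in (("Alpha 500au", ALPHA_500AU), ("Xeon 550MHz", XEON_550)):
    print(f"{name}: "
          f"{per_kilodollar(machine, 'specfp95'):.2f} SpecFP 95 per k$, "
          f"{per_kilodollar(machine, 'specint95'):.2f} SpecInt 95 per k$")
```

On these numbers the Alpha node buys roughly 45% more SpecFP 95 per dollar, while the Xeon is ahead on SpecInt 95 per dollar.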
Cobalt Network
4 x 24-port 3com SuperStack II 3300, with matrix module. Maximum intra-switch bandwidth: 1Gbyte/sec.

| Bandwidth | (Mbytes/second) |
|---|---|
| Peak bisection | 25.00 |
| Peak aggregate | 125.00 |
| Bisection (TCP) | 11.24 |
| NFS read | 3.39 |
| NFS write | 4.14 |
| Local read | 10.11 |
| Local write | 5.58 |

| Latency | (microseconds) |
|---|---|
| Round-trip (TCP) | 360 |
| Round-trip (UDP) | 354 |

Cost: cdn$13,500
Latency and bandwidth measured with Larry McVoy’s Lmbench using otherwise idle nodes.
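To get a feel for what these measurements mean for message passing, the TCP figures can be plugged into the usual linear cost model, time = latency + size / bandwidth. This is a rough sketch, not part of the talk: it assumes that one-way latency is half of the measured round trip, that "Mbytes" means 10^6 bytes, and that the single-stream bandwidth equals the measured TCP bisection figure.

```python
# Measured values from the slide (lmbench, otherwise idle nodes).
TCP_BANDWIDTH = 11.24e6      # bytes/second, bisection (TCP); assumes 1 Mbyte = 10^6 bytes
TCP_ROUND_TRIP = 360e-6      # seconds
LINK_PEAK = 100e6 / 8        # bytes/second, one direction of a 100 Mbit/s link

# Assumption: one-way latency is half of the measured round trip.
ONE_WAY_LATENCY = TCP_ROUND_TRIP / 2


def transfer_time(nbytes):
    """Estimated seconds to deliver one message of nbytes (linear model)."""
    return ONE_WAY_LATENCY + nbytes / TCP_BANDWIDTH


def effective_bandwidth(nbytes):
    """Bytes/second actually achieved for a message of nbytes."""
    return nbytes / transfer_time(nbytes)


# Message size at which half of the measured bandwidth is reached.
half_size = ONE_WAY_LATENCY * TCP_BANDWIDTH

print(f"TCP reaches {TCP_BANDWIDTH / LINK_PEAK:.0%} of the one-way link peak")
print(f"half-bandwidth message size: {half_size:.0f} bytes")
for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} bytes: {effective_bandwidth(n) / 1e6:5.2f} Mbytes/s")
```

Under this model, messages much smaller than about 2 kbytes are dominated by latency, so codes that exchange many small messages see far less than the measured bandwidth.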
Keeping Cool

| Quantity | Value |
|---|---|
| Per-node heat dissipation | 100-150W |
| Total heat dissipation | 12KW |
| Time to boil 1 liter of tap water | 28 seconds |
| Time to increase air temperature by 10ºC (assuming 4x10x4 m³ air volume) | 170 seconds |
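The two timing figures follow from the 12KW total and simple heat-capacity arithmetic. The sketch below reproduces them approximately; the physical constants, the 20ºC starting temperature of the water, and the reading of "boil" as "bring to the boiling point" are assumptions, since the talk does not state which values were used.

```python
TOTAL_POWER_W = 12_000.0          # total heat dissipation from the slide

# Assumed physical constants (not given in the talk).
WATER_HEAT_CAPACITY = 4186.0      # J/(kg K)
WATER_MASS_KG = 1.0               # 1 liter of water
WATER_START_C = 20.0              # assumed tap-water temperature
AIR_HEAT_CAPACITY = 1005.0        # J/(kg K), at constant pressure
AIR_DENSITY = 1.2                 # kg/m^3, near room temperature
ROOM_VOLUME_M3 = 4 * 10 * 4       # air volume from the slide


def heating_time(mass_kg, heat_capacity, delta_t):
    """Seconds needed to raise mass_kg by delta_t kelvin at TOTAL_POWER_W."""
    return mass_kg * heat_capacity * delta_t / TOTAL_POWER_W


water_s = heating_time(WATER_MASS_KG, WATER_HEAT_CAPACITY, 100.0 - WATER_START_C)
air_s = heating_time(AIR_DENSITY * ROOM_VOLUME_M3, AIR_HEAT_CAPACITY, 10.0)

print(f"water to boiling point: {water_s:.0f} s")
print(f"room air +10 C:         {air_s:.0f} s")
```

With these assumptions the water figure comes out at about 28 seconds and the air figure at about 160 seconds; the slide's 170 seconds corresponds to a slightly higher assumed air density or heat capacity.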
The Cluster
[Diagram: Node 1 through Node 93 attached to a switch (93x100BaseTx); a 2x100BaseTx link; a 100BaseTx (half-duplex) link to the outside world; labels: 128Mb memory, 18Gbytes RAID-1 (4 spindles)]
Computers On Benches All Linked Together
System software
• Operating system: Tru64 (Digital) Unix
  – Remote boot/local scratch setup
  – Centralized error monitoring facility
  – NFS and NIS, user files distributed over nodes
  – C, C++, and Fortran bundled under CSLG
• Parallel programming
  – PVM and MPI
• Batch queuing
  – DQS
Application software

| Package | Source | Parallel |
|---|---|---|
| Amsterdam Density Functional (ADF) | Co-developed at UofC and Vrije University (Netherlands) | PVM, MPI |
| Projector-Augmented Plane Wave (PAW) first-principles molecular dynamics | Co-developed at UofC and IBM research laboratories (Zürich) | MPI |
| Gaussian-94 (ab initio) | UofC site licence | No* |
| Visualization (Xmol, Rasmol, Viewkel, etc.) | Freely available | N/A |
| Mopac 6 and 7 (semiempirical) | Freely available | No |

(*) Gaussian supports cluster environments with Network Linda, an extra-cost package that is not available on Cobalt
Total construction cost

| Item | Cost (cdn$) |
|---|---|
| Cluster nodes, 94x | 325,992.00 |
| Network interconnect | 13,500.00 |
| Assembly and misc. | 6,500.00 |
| System software | bundled |
| Application software | 0.00 |
| Tips to system administrator | 8.00 |
| Total | 346,000.00 |
| Per-node | 3,680.85 |
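The totals can be cross-checked against the per-node price and the peak rate quoted on the "Cobalt Nodes" slide. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the bundled system software is left out because it has no separate price.

```python
NODES = 94
AVERAGE_NODE_PRICE = 3468.00      # cdn$, from the "Cobalt Nodes" slide
PEAK_GFLOPS_PER_NODE = 1.0        # 10^9 Flop/second per node

costs = {
    "Cluster nodes": NODES * AVERAGE_NODE_PRICE,
    "Network interconnect": 13_500.00,
    "Assembly and misc.": 6_500.00,
    "Application software": 0.00,
    "Tips to system administrator": 8.00,
}

total = sum(costs.values())
per_node = total / NODES
per_gigaflop = total / (NODES * PEAK_GFLOPS_PER_NODE)

for item, cost in costs.items():
    print(f"{item:<30} {cost:>12,.2f}")
print(f"{'Total':<30} {total:>12,.2f}")
print(f"{'Per node':<30} {per_node:>12,.2f}")
print(f"{'Per peak gigaflop':<30} {per_gigaflop:>12,.2f}")
```

Because each node peaks at one gigaflop, the per-node cost and the cost per peak gigaflop are the same number, which is where the cdn$3680/gigaflop figure on the title slide comes from.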
Parallel execution: ADF
• Time-limiting step: Cartesian space numerical integration
• Low demands on bandwidth and latency
• Infrequent synchronization due to replication of serial sections
• Parallel scaling is limited by static load balancing
[Diagram: execution timeline for Node 1, Node 2, and Node 3 along a Time axis, with Communications marked]
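Why a static work distribution caps the speedup can be seen in a toy model: if blocks of work are assigned to nodes in advance, the run takes as long as the most heavily loaded node. The sketch below is only an illustration of that effect. The round-robin assignment and the block costs are hypothetical and are not ADF's actual partitioning scheme or timings.

```python
def static_speedup(block_costs, n_nodes):
    """Speedup when blocks are dealt out round-robin before the run starts."""
    loads = [0.0] * n_nodes
    for i, cost in enumerate(block_costs):
        loads[i % n_nodes] += cost
    # Every node waits for the slowest one at the next synchronization point.
    return sum(block_costs) / max(loads)


# Hypothetical block costs: 64 blocks costing 1, 2, 3, 4, 1, 2, 3, 4, ...
block_costs = [float(1 + i % 4) for i in range(64)]

for n_nodes in (1, 2, 3, 4, 8):
    print(f"{n_nodes} nodes: speedup {static_speedup(block_costs, n_nodes):.2f}")
```

With these made-up costs, four nodes give a speedup of only 2.5 because one node happens to receive all of the expensive blocks; perfectly uniform blocks would give the ideal factor of 4.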
ADF: An example
Nitridoporphyrinatochromium
Full geometry optimization
38 atoms
580 basis functions
C4v symmetry
45Mbytes of memory
Serial time: 683 minutes
[Figure: speedup vs. number of nodes; ideal and observed curves]
Parallel execution: PAW
• Time-limiting step: Fast Fourier transforms
• Extensive memory requirements
• Sensitive to bandwidth and latency; >20% link bandwidth utilization on Cobalt
• Frequent synchronization, massive data exchange
• Parallel scaling is limited by the network interconnect
[Diagram: execution timeline for Node 1, Node 2, and Node 3 along a Time axis, with Communications marked]
PAW: An example
CH₃I + [Rh(CO)₂I₂]⁻, SN2 reaction
11Å unit cell
Serial time per step: 83 seconds
Memory: 231Mbytes
[Figure: speedup vs. nodes; ideal and observed curves]
[Figure: memory usage vs. nodes; ideal and measured curves]
Stability: Node uptime
[Figure: reboots per node per month (axis 1 to 4), Apr ‘98 through Oct ‘00]
Average CPU utilization
[Figure: average daily CPU utilization, percent (axis 0 to 100), Mar’00 through Nov’00]
Average CPU utilization over last 9 months: 77%
Job size distribution
[Figure: percentage of the total (axis 0 to 30), shown as fraction of time and fraction of jobs, by job length: < 3 min., 11¼ min., 45 min., 3 hours, 12 hours, 2 days, 8 days, 32 days]
Chemical research with Cobalt
Dynamical simulation of the ethylene insertion step in single-site polymerization, using a realistic model of the counterion (M. Chan and T. Ziegler)
PAW simulation: 6 months x 8 Cobalt nodes; 25,000 time steps
Chemical research with Cobalt
Polymerization and co-polymerization catalyzed by late transition metal complexes (A. Michalak and T. Ziegler)
ADF study: 6 months x 12 Cobalt nodes
Chemical research with Cobalt
Theoretical prediction of EPR g-tensors of transition metal complexes (S. Patchkovskii and T. Ziegler)
4 weeks x 4 Cobalt nodes
MoO(SPh)₄¹⁻, C4v: Calc. g = 1.927
MoO(SPh)₄¹⁻, C4: Calc. g = 1.977
MoO(SPh)₄¹⁻, ?: Expt. g = 1.979
Conclusions
• Commodity-based computer clusters are uniquely suited for quantum chemistry research
• Medium-sized clusters can be built easily, at PC-level per-node price points
• Such clusters can be utilized efficiently with only minimal system administration and maintenance
• Go for it!
Credits and Acknowledgements
• Financial support for the construction of Cobalt was provided by:
  – Canada Foundation for Innovation
  – Alberta Intellectual Infrastructure Partnership
  – Department of Chemistry, UofC
  – Scientific Chemistry Simulations, Inc.
  – Nova Chemicals
  – Mitsui Chemicals
• Many thanks to all members, current and past, of Prof. Tom Ziegler’s research group at UofC