HPC 101
HEAnet National Conference 2016
Paddy Doyle
Senior Sysadmin – Research IT / TCHPC (IT Services)
Date 2016-11-03
Trinity College Dublin, The University of Dublin
Brief Overview
Big picture
Motivation for HPC
What that means for software
What that means for hardware
Typical day as a HPC sysadmin
Big Picture
Large industry
– Circa $10 billion annual spend
Major vendors
– HP, IBM, Dell, SGI, Fujitsu, Intel
Largest HPC systems:
– Over 10,000,000 CPU cores
– Many 10,000s of nodes
– 100s of cabinets
– 15MW of power!
High Performance Computing in numbers
Measuring Performance: Top500.org
High Performance LINPACK benchmark
– Dense linear algebra
FLOPS: FLoating-point Operations Per Second
List of most powerful machines
| Machine | Performance | FLOPS |
|---|---|---|
| Typical PC | 100 GFLOPS | 100,000,000,000 |
| Sunway TaihuLight (#1) | 93 PFLOPS | 93,000,000,000,000,000 |
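To put the two rows in proportion, the short calculation below divides one figure by the other. It uses only the two numbers quoted above; the variable names are illustrative, and LINPACK figures say nothing about how a real application would scale.

```python
# Figures quoted above.
TYPICAL_PC_FLOPS = 100 * 10**9      # 100 GFLOPS
TAIHULIGHT_FLOPS = 93 * 10**15      # 93 PFLOPS

# How many typical PCs match the #1 machine on paper?
ratio = TAIHULIGHT_FLOPS // TYPICAL_PC_FLOPS

# How long would one PC need for what TaihuLight does in one second?
seconds_on_pc = TAIHULIGHT_FLOPS / TYPICAL_PC_FLOPS
days_on_pc = seconds_on_pc / 86_400

print(f"{ratio:,} typical PCs")
print(f"{days_on_pc:.1f} days")
```

On these figures, one second of work on the #1 machine corresponds to roughly eleven days on a single typical PC.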
Top500 Performance Development
Currently peta-scale; when will we reach exa-scale?
Motivation for HPC
Bigger:
– memory-bound problems
Faster:
– CPU-bound problems
“HPC is the art of getting bigger things done faster” – D. Frost
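A rough way to see both limits at once is to size a dense linear solve, the kind of problem the LINPACK benchmark measures. The sketch below assumes 8-byte doubles, counts only the leading (2/3)n³ term of an LU factorisation, and assumes the machine sustains its quoted rate throughout. The problem size n = 100,000 is a hypothetical example, and `dense_solve_cost` is an illustrative helper, not part of any library.

```python
def dense_solve_cost(n, flops_per_second):
    """Rough memory and time estimate for solving a dense n x n system."""
    memory_bytes = 8 * n * n                 # one 8-byte double per matrix entry
    operations = (2 / 3) * n**3              # leading term of LU factorisation
    seconds = operations / flops_per_second  # assumes the rate is sustained
    return memory_bytes, seconds

# 100 GFLOPS is the "typical PC" figure quoted earlier in the talk.
memory_bytes, seconds = dense_solve_cost(100_000, 100e9)
print(f"{memory_bytes / 1e9:.0f} GB, {seconds / 3600:.1f} hours")
```

Under these assumptions the matrix alone needs 80 GB, which is the "bigger" problem, and the factorisation takes close to two hours at 100 GFLOPS, which is the "faster" problem.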
What that means for software
Parallel languages and libraries
– MPI, OpenMP, CUDA, OpenCL, PGAS
– BLAS, MKL, ATLAS, FFTW, Boost, PLASMA, PETSc
System administration
– Resource manager, queuing system
– Uniform environments
– Parallel filesystem (100s or 1000s of client nodes)
Software must communicate between cores and compute nodes
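The pattern behind most parallel codes is: split the work, compute the pieces independently, then combine the results. The sketch below shows only that pattern, using threads in a single Python process as a stand-in for MPI ranks or OpenMP threads. It is not MPI and will not run faster than a serial loop in CPython; the names `block_range`, `partial_sum` and `parallel_sum_of_squares` are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def block_range(rank, size, n):
    """Return the half-open [start, stop) slice of 0..n owned by worker `rank`."""
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def partial_sum(bounds):
    """Work done independently by one worker on its own slice."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))

def parallel_sum_of_squares(n, workers=4):
    # "Scatter": give each worker a contiguous block of indices.
    chunks = [block_range(r, workers, n) for r in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))  # "compute" step
    return sum(partials)                                # "reduce" step

total = parallel_sum_of_squares(1_000)
print(total)
```

On a real cluster the reduce step is where communication between nodes happens, which is why the network matters as much as the cores.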
What that means for hardware
Specialised hardware vs commodity servers
– Cray, IBM BlueGene
CPU: many-core, larger caches
Accelerator cards:
– GPGPU, Intel Xeon Phi
High-speed, low-latency networks
– InfiniBand (40, 56, 100 Gb/s; <1 µs latency)
– Topologies: fat-tree, torus
Parallel filesystem
– Fast spinning disk, flash drives, hierarchies
Many cores, fast networking
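Why both bandwidth and latency matter can be seen with a simple latency-plus-bandwidth cost model. The sketch below is an assumption-laden simplification: it takes the 56 Gb/s rate and treats the quoted <1 µs latency as exactly 1 µs, and it ignores encoding overhead, topology and contention. `transfer_time_seconds` is an illustrative helper.

```python
def transfer_time_seconds(message_bytes, bandwidth_gbps, latency_us):
    """Simplified model: fixed latency plus size divided by bandwidth."""
    bandwidth_bytes_per_s = bandwidth_gbps * 1e9 / 8
    return latency_us * 1e-6 + message_bytes / bandwidth_bytes_per_s

small = transfer_time_seconds(8, 56, 1.0)              # a single double
large = transfer_time_seconds(1_000_000_000, 56, 1.0)  # 1 GB of data

print(f"8 B:  {small * 1e6:.3f} microseconds")
print(f"1 GB: {large:.3f} seconds")
```

In this model the small message is almost entirely latency and the large one almost entirely bandwidth, so codes that exchange many small messages gain most from a low-latency network.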
Typical day as a HPC sysadmin
[Occasionally] design, rack, install, provision new systems
What software do researchers need?
– ‘yum install’ or ‘./configure; make’
– Build gcc-6.2.0, then openmpi-2.0.1 using gcc, then boost-1.62 using both, THEN try to compile their software
– Compile scientific software (sometimes without Makefiles)
– Complex software stack!
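The build chain above is a dependency graph, and the order in which things must be built falls out of a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The package versions are the ones named on the slide; "researcher-app" is a placeholder for whatever the researcher actually wants to run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key lists what must already be built before it.
depends_on = {
    "gcc-6.2.0": set(),
    "openmpi-2.0.1": {"gcc-6.2.0"},
    "boost-1.62": {"gcc-6.2.0", "openmpi-2.0.1"},
    "researcher-app": {"gcc-6.2.0", "openmpi-2.0.1", "boost-1.62"},  # placeholder
}

build_order = list(TopologicalSorter(depends_on).static_order())
print(" -> ".join(build_order))
```

With one compiler and one MPI this is a single chain; supporting several compilers and MPI versions multiplies the combinations, which is where the complexity comes from.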
Node / queue / network: health checks and auto-remediation
Tweak provisioning config (Salt, Ansible, Puppet, etc.)
“Why did my job fail?”
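A common first step in answering that question is decoding the job's exit status. The sketch below relies on the POSIX shell convention that a process killed by signal N is reported as exit code 128 + N. Whether a given resource manager reports status this way is an assumption, and `describe_exit_code` is an illustrative helper, not part of any scheduler.

```python
import signal

def describe_exit_code(code):
    """Translate a shell-style exit status into a first diagnostic hint."""
    if code == 0:
        return "completed successfully"
    if code > 128:
        # Shells report "killed by signal N" as 128 + N.
        try:
            name = signal.Signals(code - 128).name
        except ValueError:
            return f"exit code {code} (no matching signal)"
        return f"terminated by {name}"
    return f"application error (exit code {code})"

for code in (0, 1, 137, 139):
    print(code, describe_exit_code(code))
```

A SIGKILL (137) often, though not always, points to a memory or time limit being enforced, while a SIGSEGV (139) usually points back at the application itself.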
Thank You