QNIBTerminal Plus InfiniBand - Containerized MPI Workloads
DESCRIPTION
In this deck, Christian Kniep presents: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads. Watch the video presentation: http://wp.me/p3RLHQ-dvM
TRANSCRIPT
![Page 1: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/1.jpg)
QNIBTerminal plus InfiniBand
Containerized MPI Workloads
2014-11-05
Christian Kniep
insideHPC Edition
Slides slightly modified in comparison to the HPC Advisory Council presentation
![Page 2: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/2.jpg)
Agenda
• Docker in a Nutshell
• QNIBTerminal
• Testbed
• MPI Benchmark
• HPCG-Results
• Future Work
• Conclusion
![Page 3: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/3.jpg)
Docker in a Nutshell
• (chroot on steroids)²
![Page 4: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/4.jpg)
Docker in a Nutshell
• (chroot on steroids)²
• Builds on top of LinuX Containers (LXC)
• Kernel namespaces (isolation)
• cgroups (resource mgmt)
![Page 5: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/5.jpg)
Docker in a Nutshell
• (chroot on steroids)²
• Builds on top of LinuX Containers (LXC)
• Kernel namespaces (isolation)
• cgroups (resource mgmt)
• Intuitive build system
![Page 6: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/6.jpg)
Docker in a Nutshell
• (chroot on steroids)²
• Builds on top of LinuX Containers (LXC)
• Kernel namespaces (isolation)
• cgroups (resource mgmt)
• Intuitive build system
• Red Hat backing
• Public repositories
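The two kernel features named on this slide can be inspected from any Linux process. The following Python sketch is not part of the talk; it assumes a Linux host with /proc mounted, and the function names are illustrative. It lists the namespaces and cgroup memberships of a process, which is one way to see what a container runtime sets up. Run on the host and again inside a container, it would typically report different namespace identifiers.

```python
import os


def list_namespaces(pid="self"):
    """Map each namespace type (e.g. 'net', 'pid') to its kernel identifier."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces  # not Linux, /proc not mounted, or no such pid
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like 'net:[4026531992]'.
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable
    return namespaces


def list_cgroups(pid="self"):
    """Return the non-empty lines of /proc/<pid>/cgroup."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:18s} {ident}")
    for line in list_cgroups():
        print(line)
```

Two processes that print the same identifier for a namespace type share that namespace; differing identifiers mean they are isolated from each other for that resource.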
![Page 7: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/7.jpg)
Traditional vs. Lightweight Layers
[Figure: two layer stacks side by side; IB is labelled in the figure]
• Traditional Virtualisation: SERVER, HOST KERNEL, HYPERVISOR, KERNEL (one per guest), Userland (OS), SERVICE
• Containerisation: SERVER, HOST KERNEL, Userland (OS), SERVICE
![Page 8: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/8.jpg)
QNIBTerminal Motivation
Plain Metrics
![Page 9: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/9.jpg)
QNIBTerminal Motivation
Plain Log Events
![Page 10: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/10.jpg)
QNIBTerminal Motivation
Overlap Metrics/Log Events
![Page 11: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/11.jpg)
QNIBTerminal Overview
[Diagram: containers, with the service each one runs in parentheses, arranged under the labels Log/Events, Services, Performance, and Compute]
• haproxy (haproxy)
• dns (helixdns)
• elk (kibana, logstash, elasticsearch)
• etcd
• carbon (carbon)
• graphite-web (graphite-web)
• graphite-api (graphite-api)
• grafana (grafana)
• slurmctld (slurmctld)
• compute0 ... compute<N> (slurmd)
One Node Setup
• All network traffic over bridge
• Crippled MPI workload
![Page 12: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/12.jpg)
Testbed
• 8 nodes (CentOS 7, 2x 4-core Xeon, 32 GB, Mellanox ConnectX-2)
• 3 containers on top (CentOS 6, CentOS 7, Ubuntu 12)
• SLURM Resource Scheduler
• 1 native partition
• 3 container partitions
• Multiple Open MPI versions installed
• gcc versions
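With one native partition and three container partitions, the same job can be pointed at each environment in turn. The sketch below is not from the slides: it only builds `sbatch` command lines, using the real `--partition` and `--nodes` options. The partition names reuse the labels from the result charts (native, cos7, cos6, u12); whether the SLURM partitions were actually named that way is an assumption, and `hpcg.sh` is a hypothetical job script.

```python
import shlex

# Assumed partition names, borrowed from the chart labels in this deck.
PARTITIONS = ["native", "cos7", "cos6", "u12"]


def sbatch_command(partition, script, nodes=2):
    """Build the sbatch argument list for one partition."""
    return ["sbatch", f"--partition={partition}", f"--nodes={nodes}", script]


def submission_plan(script, partitions=PARTITIONS, nodes=2):
    """Return one shell-quoted sbatch command line per partition."""
    return [shlex.join(sbatch_command(p, script, nodes)) for p in partitions]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch is safe
    # to run on a machine without SLURM installed.
    for line in submission_plan("hpcg.sh"):
        print(line)
```

Keeping the commands as argument lists makes it straightforward to hand them to `subprocess.run` later without going through a shell.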
![Page 13: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/13.jpg)
MPI Benchmark
• osu-micro-benchmarks-4.4.1
• osu_alltoall with two tasks on two hosts
    $ mpirun -np 2 -H venus001,venus002 $(pwd)/osu_alltoall
    # OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
    # Size    Avg Latency(us)
    1         1.83
    2         1.82
    4         1.74
    8         1.63
    16        1.62
    32        1.68
    64        1.80
    128       2.77
    256       3.11
    512       3.51
MPI benchmark was not in original HPC Advisory Council Presentation
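The Open MPI comparison later in the deck reports a single number per configuration, avg(1B->64B). As a rough illustration of that aggregation, the Python sketch below parses osu_alltoall output and averages the latencies for message sizes from 1 to 64 bytes. The sample text is the output shown on this slide; the function names are made up for the example, and whether the author computed the average exactly this way is an assumption.

```python
OSU_OUTPUT = """\
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
# Size Avg Latency(us)
1 1.83
2 1.82
4 1.74
8 1.63
16 1.62
32 1.68
64 1.80
128 2.77
256 3.11
512 3.51
"""


def parse_osu_latency(text):
    """Map message size in bytes to average latency in microseconds."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the benchmark header
        size, latency = line.split()[:2]
        results[int(size)] = float(latency)
    return results


def mean_latency(results, min_size=1, max_size=64):
    """Arithmetic mean of the latencies for sizes in [min_size, max_size]."""
    selected = [lat for size, lat in results.items() if min_size <= size <= max_size]
    if not selected:
        raise ValueError("no samples in the requested size range")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    latencies = parse_osu_latency(OSU_OUTPUT)
    print(f"avg(1B->64B) = {mean_latency(latencies):.2f} us")  # prints 1.73 us
```

For the run above, the seven samples from 1 to 64 bytes sum to 12.12 us, giving a mean of about 1.73 us.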
![Page 14: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/14.jpg)
MPI Benchmark
distribution’s results [2 tasks @ 2 nodes]
[Chart: latency [us] (axis 0 to 5) over Message Size (KB), 4 to 1024; series: native, cos7, cos6, u12]
MPI benchmark was not in original HPC Advisory Council Presentation
![Page 15: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/15.jpg)
MPI Benchmark
Open MPI comparison [2 tasks @ 2 nodes, avg(1B->64B)]
[Chart: latency [us] (axis 0 to 2.8); groups: distribution, 1.5.4, 1.6.4, 1.8.3; series: native, cos7, cos6, u12; labels in that order: oMPI 1.6.4, oMPI 1.6.4, oMPI 1.5.4, oMPI 1.5.4]
![Page 16: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/16.jpg)
HPCG Benchmark
• Mimics thermodynamic application workload
• Linpack corrective / successor in the long term?
![Page 17: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/17.jpg)
HPCG Benchmark
distribution’s results
[Chart: GFLOP/s (axis 3 to 6); series: native, cos7, cos6, u12]
• CentOS 7.0: oMPI 1.6.4, gcc 4.8.2
![Page 18: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/18.jpg)
HPCG Benchmark
distribution’s results
[Chart: GFLOP/s (axis 3 to 6); series: native, cos7, cos6, u12]
• CentOS 7.0: oMPI 1.6.4, gcc 4.8.2
• CentOS 6.5: oMPI 1.5.4, gcc 4.4.7
• Ubuntu 12.04: oMPI 1.5.4, gcc 4.6.3
![Page 19: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/19.jpg)
HPCG Benchmark
Open MPI comparison
[Chart: GFLOP/s (axis 3 to 6); group: distribution; series: native, cos7, cos6, u12; labels in that order: oMPI 1.6.4, oMPI 1.6.4, oMPI 1.5.4, oMPI 1.5.4]
![Page 20: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/20.jpg)
HPCG Benchmark
Open MPI comparison
[Chart: GFLOP/s (axis 3 to 6); groups: distribution, 1.6.4, 1.8.4; series: native, cos7, cos6, u12; labels in that order: oMPI 1.6.4 / gcc 4.8.2, oMPI 1.6.4 / gcc 4.8.2, oMPI 1.5.4 / gcc 4.4.7, oMPI 1.5.4 / gcc 4.6.3]
![Page 21: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/21.jpg)
HPCG Benchmark
Open MPI comparison
[Chart: GFLOP/s (axis 3 to 6); groups: distribution, 1.5.4, 1.6.4, 1.8.4; series: native, cos7, cos6, u12; labels in that order: oMPI 1.6.4 / gcc 4.8.2, oMPI 1.6.4 / gcc 4.8.2, oMPI 1.5.4 / gcc 4.4.7, oMPI 1.5.4 / gcc 4.6.3]
![Page 22: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/22.jpg)
Future Work
• Benchmark real-world applications
• Security evaluations
• Compare different orchestration frameworks
• Use of SR-IOV (Keynote earlier today)
• Compare with tuned bare-metal
• Tune Docker installation
![Page 23: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/23.jpg)
Conclusion
• Bunch of tooling within the Docker ecosystem
• Abstraction between bare-metal and application works fine
• Low performance overhead
• Out-of-the-box: container beats bare-metal
• Continuous testing/deployment of containerized workloads
• Bare-metal kernel provides access to IB
• Container in charge from MPI upwards
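The split described here, where the host kernel owns the InfiniBand driver and the container supplies everything from MPI upwards, implies that the container needs access to the host's IB device nodes. The slides do not show how this was done. One common approach, assumed in the sketch below, is to pass the device nodes through with `docker run --device`. The image name, the device paths, and the `ibv_devinfo` check are illustrative assumptions, not taken from the talk.

```python
import shlex


def docker_run_with_devices(image, command, devices, name=None):
    """Build a `docker run` argument list that exposes host device nodes."""
    argv = ["docker", "run", "--rm"]
    if name:
        argv += ["--name", name]
    for device in devices:
        # --device=<path> maps the host device node to the same path
        # inside the container.
        argv.append(f"--device={device}")
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # Typical RDMA device nodes on a Linux host; the actual names depend on
    # the hardware and driver, so treat these paths as placeholders.
    ib_devices = ["/dev/infiniband/uverbs0", "/dev/infiniband/rdma_cm"]
    argv = docker_run_with_devices(
        image="example/compute-node",  # hypothetical image name
        command=["ibv_devinfo"],       # assumes libibverbs utilities in the image
        devices=ib_devices,
    )
    print(shlex.join(argv))
```

The function only assembles the command; printing it first makes it easy to review before handing the list to `subprocess.run` on a host that actually has the devices.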
![Page 24: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/24.jpg)
La Fin
• Contact
  • @CQnib / @qnibinc
  • [email protected]
  • http://qnib.org
https://www.flickr.com/photos/dharmabum1964/3108162671
![Page 25: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/25.jpg)
La Fin
• Paper: http://doc.qnib.org/
• Contact
  • @CQnib / @qnibinc
  • [email protected]
  • http://qnib.org
https://www.flickr.com/photos/dharmabum1964/3108162671
![Page 26: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/26.jpg)
La Fin
• Interested?
  • Docker Pitch today
  • Internal Evaluations
  • Workshops / Talks
• Paper: http://doc.qnib.org/
• Contact
  • @CQnib / @qnibinc
  • [email protected]
  • http://qnib.org
https://www.flickr.com/photos/dharmabum1964/3108162671
![Page 27: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads](https://reader033.vdocuments.mx/reader033/viewer/2022052601/559459a21a28ab4c728b45a5/html5/thumbnails/27.jpg)
La Fin
• Interested?
  • Docker Pitch today
  • Internal Evaluations
  • Workshops / Talks
• Questions?
• Paper: http://doc.qnib.org/
• Contact
  • @CQnib / @_qnib
  • [email protected]
  • http://qnib.org
https://www.flickr.com/photos/dharmabum1964/3108162671