Parallelization Strategies of Deep Learning Neural Network Training
Disclaimer: This is a beta talk - don’t kill me. (Romeo Kienzler)
Building Brains: Parallelisation Strategies of Large-Scale Deep Learning Neural Networks on Scale-Out Architectures Like Apache Spark
Romeo Kienzler, IBM Watson IoT
the forward pass
back propagation
gradient descent
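The three steps the slides walk through (forward pass, backpropagation, gradient descent) fit in a few lines of plain Python. A toy one-weight model fitting y = 2x; the data and learning rate are made up for illustration:

```python
# Minimal sketch: forward pass, backpropagation, and gradient descent
# for a one-weight linear "network" y_hat = w * x, fitting y = 2x.
# All numbers here are illustrative, not from the talk.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
lr = 0.05  # learning rate

for epoch in range(200):
    grad = 0.0
    for x, y in data:
        y_hat = w * x                 # forward pass
        grad += 2 * (y_hat - y) * x   # backpropagation: d(loss)/dw for squared error
    grad /= len(data)
    w -= lr * grad                    # gradient descent step

print(round(w, 3))  # converges towards 2.0
```

The same three-phase loop is what all the parallelisation strategies below try to distribute.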
parallelisation
‣ inter-model parallelism
‣ data parallelism
‣ intra-model parallelism
‣ pipelined parallelism
inter-model parallelism (aka hyperparameter space exploration / tuning)
Model A → Node 1, Model B → Node 2, Model C → Node 3
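With inter-model parallelism, each node trains a completely independent model with its own hyperparameters, so no communication is needed until the results are compared. A minimal single-machine sketch, using threads as stand-ins for Node 1-3 and a tiny made-up model (all names and numbers are illustrative):

```python
# Inter-model parallelism: each worker trains an independent model with its
# own hyperparameters (here: the learning rate), then the best one is kept.
from concurrent.futures import ThreadPoolExecutor

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # fit y = 2x

def train(lr, epochs=100):
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return lr, w, loss

candidate_lrs = [0.001, 0.01, 0.05]             # hyperparameter grid
with ThreadPoolExecutor(max_workers=3) as pool:  # "Node 1..3"
    results = list(pool.map(train, candidate_lrs))

best = min(results, key=lambda r: r[2])          # keep the best model
print("best lr:", best[0])
```

On a real cluster the `train` calls would run on separate nodes; the structure (independent runs, one final comparison) stays the same.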
data parallelism (aka “Jeff Dean style” parameter averaging)
Model A → Node 1, Model A → Node 2, Model A → Node 3 (three replicas of the same model)
Parameter Server
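In data parallelism every node trains a replica of the same model on its own data shard, and a parameter server periodically averages the replicas' weights. A minimal sequential sketch of one such averaging scheme; the shards, learning rate, and sync schedule are made up:

```python
# Data parallelism with parameter averaging: every node holds a replica of
# the same model, trains on its own data shard, and a parameter server
# averages the weights after each local step.

shards = [
    [(1.0, 2.0), (2.0, 4.0)],    # shard on Node 1 (fitting y = 2x)
    [(3.0, 6.0), (4.0, 8.0)],    # shard on Node 2
    [(5.0, 10.0), (6.0, 12.0)],  # shard on Node 3
]

def local_step(w, shard, lr=0.01):
    grad = sum(2 * (w * x - y) * x for x, y in shard) / len(shard)
    return w - lr * grad

w_global = 0.0                                            # parameter server state
for sync_round in range(50):
    replicas = [local_step(w_global, s) for s in shards]  # parallel on a real cluster
    w_global = sum(replicas) / len(replicas)              # parameter averaging

print(round(w_global, 2))  # approaches 2.0
```

The trade-off hidden in the sync schedule: more local steps per averaging round means less network traffic but staler replicas.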
intra-model parallelism
Model A Part 1 → Node 1, Model A Part 2 → Node 2, Model A Part 3 → Node 3
Parameter Server
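Intra-model parallelism splits one model across nodes, for example by partitioning a layer's weight matrix so each node computes its own slice of the layer output. A pure-Python sketch with a made-up 2 x 3 weight matrix split column-wise into three parts:

```python
# Intra-model parallelism: one dense layer's weight matrix W is partitioned
# column-wise; each "node" computes its slice of the layer output, and the
# slices are concatenated. Weights and input are made up for the example.

x = [1.0, 2.0]              # layer input (broadcast to every node)

W_parts = [                 # column blocks of one weight matrix W
    [[1.0], [0.0]],         # Model A Part 1 on Node 1
    [[0.0], [1.0]],         # Model A Part 2 on Node 2
    [[1.0], [1.0]],         # Model A Part 3 on Node 3
]

def matvec_slice(W_block, x):
    cols = len(W_block[0])
    return [sum(W_block[i][j] * x[i] for i in range(len(x))) for j in range(cols)]

out = []
for part in W_parts:        # would run in parallel on a cluster
    out.extend(matvec_slice(part, x))

print(out)  # [1.0, 2.0, 3.0] - the full layer output
```

Unlike data parallelism, the nodes here must exchange activations every layer, so this pays off mainly when the model itself no longer fits on one node.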
pipelined parallelism
[Figure: input rows 0 0, 0 1, 1 0, 1 1 flowing through the network stage by stage]
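In pipelined parallelism, consecutive layers sit on different nodes: while the later stage processes one input, the earlier stage is already working on the next. A two-stage sketch with threads and queues; the stage functions are made up (they happen to compute XOR of the 0/1 input pairs from the figure):

```python
# Pipelined parallelism: two "stages" (groups of layers) connected by a
# queue. Stage 1 can start on batch k+1 while stage 2 still works on batch k.
import threading
import queue

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]   # the 0 0 / 0 1 / 1 0 / 1 1 rows
q12 = queue.Queue()                          # stage 1 -> stage 2
done = queue.Queue()                         # stage 2 -> results

def stage1():                                # e.g. the first layers of the net
    for x in inputs:
        q12.put((x, x[0] + x[1]))            # some intermediate activation
    q12.put(None)                            # end-of-stream marker

def stage2():                                # e.g. the remaining layers
    while True:
        item = q12.get()
        if item is None:
            break
        x, h = item
        done.put((x, h % 2))                 # final output (XOR of the inputs)

t1 = threading.Thread(target=stage1)
t2 = threading.Thread(target=stage2)
t1.start(); t2.start()
t1.join(); t2.join()

results = [done.get() for _ in inputs]
print(results)
```

With only one batch in flight the pipeline is no faster than a single node; throughput comes from keeping all stages busy on different batches.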
Apache Spark
Driver JVM
5 × Compute Node, each with 3 × Executor JVM
DeepLearning4J
vs.
data parallelism
ND4J (DeepLearning4J)
• Tensor support (Linear Buffer + Stride)
• Multiple implementations, one interface
• Vectorized C++ code (JavaCPP), off-heap data storage, BLAS (OpenBLAS, Intel MKL, cuBLAS)
• GPU (CUDA 8)
intra-model parallelism
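The “Linear Buffer + Stride” bullet is the core of ND4J's tensor layout: an n-dimensional array is one flat (off-heap) buffer plus a stride per dimension, so views like transpose need no copy. A plain-Python sketch of the addressing scheme (not ND4J code; buffer contents are made up):

```python
# "Linear Buffer + Stride": element (i, j) of a row-major 2 x 3 matrix
# stored in a flat buffer lives at offset i * 3 + j. Changing the strides
# reinterprets the same buffer, e.g. as its transpose, without copying.

buffer = [1, 2, 3, 4, 5, 6]   # flat storage of [[1, 2, 3], [4, 5, 6]]

def get(idx, strides):
    # Dot product of the index with the strides gives the flat offset.
    return buffer[sum(i * s for i, s in zip(idx, strides))]

print(get((1, 2), (3, 1)))    # row-major view, element [1][2] -> 6
print(get((2, 1), (1, 3)))    # transposed view of the same buffer, [2][1] -> 6
```

The same buffer-plus-strides idea is what lets one interface sit on top of multiple backends (CPU BLAS, GPU), since only the buffer location and kernels change.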
Apache SystemML
• Custom machine learning algorithms
• Declarative ML
• Transparent distribution on data-parallel framework
• Scale-up
• Scale-out
• Cost-based optimiser generates low-level execution plans
2007-2008: Multiple projects at IBM Research - Almaden involving machine learning on Hadoop.
2009: We form a dedicated team for scalable ML.
2009-2010: Through engagements with customers, we observe how data scientists create ML solutions.
2011-2014: Research.
June 2015: IBM announces open-source SystemML.
September 2015: Code available on GitHub.
November 2015: SystemML enters Apache incubation.
February 2016: First release (0.9) of Apache SystemML.
June 2016: Second Apache release (0.10).
Apache SystemML (DML script: alternating least squares with conjugate gradient)

```
U = rand(nrow(X), r, min = -1.0, max = 1.0);
V = rand(r, ncol(X), min = -1.0, max = 1.0);
while(i < mi) {
  i = i + 1; ii = 1;
  if (is_U)
    G = (W * (U %*% V - X)) %*% t(V) + lambda * U;
  else
    G = t(U) %*% (W * (U %*% V - X)) + lambda * V;
  norm_G2 = sum(G ^ 2); norm_R2 = norm_G2;
  R = -G; S = R;
  while(norm_R2 > 10E-9 * norm_G2 & ii <= mii) {
    if (is_U) {
      HS = (W * (S %*% V)) %*% t(V) + lambda * S;
      alpha = norm_R2 / sum(S * HS);
      U = U + alpha * S;
    } else {
      HS = t(U) %*% (W * (U %*% S)) + lambda * S;
      alpha = norm_R2 / sum(S * HS);
      V = V + alpha * S;
    }
    R = R - alpha * HS;
    old_norm_R2 = norm_R2; norm_R2 = sum(R ^ 2);
    S = R + (norm_R2 / old_norm_R2) * S;
    ii = ii + 1;
  }
  is_U = ! is_U;
}
```
SystemML: compile and run at scale - no performance code needed!
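The inner while loop of the DML script above is a standard conjugate-gradient (CG) solve. A minimal pure-Python CG for a small symmetric positive-definite system A x = b, to illustrate the pattern only (this is not the SystemML runtime; A and b are made up):

```python
# Conjugate gradient: same update structure as the DML inner loop
# (HS, alpha, R, S, and the squared residual norms).

def cg(A, b, iters=25, tol=1e-12):
    n = len(b)
    x = [0.0] * n
    r = b[:]                       # residual; x starts at 0, so r = b
    s = r[:]                       # search direction (S in the DML script)
    norm_r2 = sum(ri * ri for ri in r)
    for _ in range(iters):
        if norm_r2 <= tol:
            break
        hs = [sum(A[i][j] * s[j] for j in range(n)) for i in range(n)]  # HS = A %*% S
        alpha = norm_r2 / sum(s[i] * hs[i] for i in range(n))
        x = [x[i] + alpha * s[i] for i in range(n)]
        r = [r[i] - alpha * hs[i] for i in range(n)]
        old_norm_r2, norm_r2 = norm_r2, sum(ri * ri for ri in r)
        s = [r[i] + (norm_r2 / old_norm_r2) * s[i] for i in range(n)]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = cg(A, b)
print([round(v, 4) for v in x])   # solves A x = b
```

In SystemML the matrix products inside this loop are what the optimizer distributes; the algorithm author writes only the linear algebra.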
[Chart: running time (sec, 0-20000) of R, MLLib, and SystemML on 1.2GB (sparse binary), 12GB, and 120GB inputs; the R and MLLib runs end in >24h or OOM at the larger scales.]
Architecture
High-Level Algorithm → SystemML Optimizer → Parallel Spark Program
High-level language front-ends
High-Level Operations (HOPs): general representation of statements in the data analysis language
Cost-Based Optimizer
Low-Level Operations (LOPs): general representation of operations in the runtime framework
Multiple execution environments
data parallelism
intra-model parallelism
TensorFramesCompute Node
Executor JVM
Executor JVM
Executor JVM
Compute Node
Executor JVM
Executor JVM
Executor JVM
Compute Node
Executor JVM
Executor JVM
Executor JVM
Compute Node
Executor JVM
Executor JVM
Executor JVM
Compute Node
Executor JVM
Executor JVM
Executor JVM
![Page 59: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/59.jpg)
TensorFrames

[Same diagram as the previous slide]
![Page 60: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/60.jpg)
TensorFrames

[Same diagram, highlighting inter-model parallelism]
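Inter-model parallelism trains several *independent* models at once, one per worker, e.g. the same architecture under different hyperparameters, and keeps the best. A hedged sketch under illustrative assumptions (a one-weight least-squares model and a hand-picked learning-rate grid; nothing here is TensorFrames' actual API):

```python
# Sketch of inter-model parallelism: candidate models (here, learning
# rates for a 1-D least-squares fit) train concurrently, one per
# worker, and the lowest-loss model wins. Model and grid are toy.
from concurrent.futures import ThreadPoolExecutor

DATA = [(x, 2.0 * x) for x in range(1, 6)]  # targets follow w = 2

def train(lr, steps=200):
    # Gradient descent on mean squared error for a single weight w.
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in DATA) / len(DATA)
        w -= lr * grad
    loss = sum((w * x - y) ** 2 for x, y in DATA) / len(DATA)
    return loss, lr, w

def best_model(learning_rates):
    # Each candidate is fully independent -> trivially parallel;
    # on a cluster, each call to train() would be one executor's task.
    with ThreadPoolExecutor(max_workers=len(learning_rates)) as pool:
        results = list(pool.map(train, learning_rates))
    return min(results)  # smallest loss wins

loss, lr, w = best_model([0.001, 0.01, 0.05])
print(round(w, 2))  # converges near 2.0
```

Note the contrast with the parameter-server pattern on the next slides: here workers never exchange gradients, they only report a final score, so there is no synchronization cost during training.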
![Page 61: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/61.jpg)
TensorSpark

[Diagram: a parameter server runs on the driver, coordinating five compute nodes with three executor JVMs each]
![Page 62: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/62.jpg)
TensorSpark

[Same diagram, highlighting data parallelism]
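The pattern the TensorSpark slides describe, data parallelism via a parameter server on the driver, can be sketched as: the driver holds the weights, each worker computes a gradient on its own shard of the data, and the driver averages the gradients and applies the update. A minimal pure-Python sketch under illustrative assumptions (a one-weight linear model; names and hyperparameters are not TensorSpark's):

```python
# Sketch of parameter-server data parallelism: the "driver" owns the
# parameter w, each "worker" owns a data shard and returns a gradient,
# and the driver averages gradients and updates w each epoch.

def worker_gradient(w, shard):
    # Gradient of mean squared error on this worker's data shard.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train_data_parallel(data, n_workers=3, lr=0.01, epochs=100):
    w = 0.0  # parameter held by the driver
    shards = [data[i::n_workers] for i in range(n_workers)]
    for _ in range(epochs):
        # The gradient computations are the data-parallel part; on a
        # cluster each shard would live on a different executor JVM.
        grads = [worker_gradient(w, s) for s in shards]
        w -= lr * sum(grads) / len(grads)  # driver aggregates + updates
    return w

data = [(x, 3.0 * x) for x in range(1, 7)]
print(round(train_data_parallel(data), 2))  # → 3.0
```

This is synchronous averaging for simplicity; a real parameter server may also apply worker updates asynchronously, trading gradient staleness for throughput.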
![Page 63: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/63.jpg)
CaffeOnSpark
![Page 64: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/64.jpg)
CaffeOnSpark

[Same slide, highlighting data parallelism]
![Page 65: Parallelization Stategies of DeepLearning Neural Network Training](https://reader030.vdocuments.mx/reader030/viewer/2022021507/5a65f53a7f8b9aaf638b674d/html5/thumbnails/65.jpg)
?