JOACHIM MEYER
IMAGE PROCESSING
GPGPU ENGINEER
EVALUATION OF MODERN GPGPU TECHNOLOGIES FOR IMAGE PROCESSING
IWOCL & SYCLcon ’20 | 10/04/2020 | Slide 2
(TOO?) MANY DIFFERENT GPGPU PROGRAMMING MODELS / APIS
Which one fits this awesome new project?
AGENDA
▪ SELECTION OF COMPARED APIS
▪ EVALUATION SETUP
▪ PERFORMANCE
▪ USABILITY
▪ PLATFORM INDEPENDENCE
▪ CONCLUSION
▪ FUTURE PROSPECTS
Selection of APIs
BASICS
Test Project setup
BASICS
• Targeting all 4 APIs + CPU reference implementation
• Targeted devices: CPUs & GPUs
• OSs: Windows & Linux 64bit
• Implementations:
• CUDA 10.1
• Vulkan 1.1
• OpenCL 1.2
• ComputeCpp (Win) & hipSYCL (Linux)
• Algorithms for polarization camera image processing
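The slides do not say which polarization algorithms were implemented. As a rough sketch of what a per-pixel CPU reference step for such a camera might look like, the C++ below computes the linear Stokes parameters, the degree of linear polarization (DoLP) and the angle of linear polarization (AoLP) from four intensity samples taken behind 0°, 45°, 90° and 135° polarizers. The struct and function names are hypothetical and chosen only for illustration.

```cpp
#include <cmath>

// Illustrative result type; not taken from the talk.
struct PolarizationPixel {
    float intensity;  // Stokes S0 (total intensity)
    float dolp;       // degree of linear polarization, 0 for unpolarized light
    float aolp;       // angle of linear polarization in radians
};

// Hypothetical CPU reference for one pixel.
// i0, i45, i90, i135 are the intensities behind the four polarizer angles.
inline PolarizationPixel polarizationFromAngles(float i0, float i45,
                                                float i90, float i135) {
    const float s0 = 0.5f * (i0 + i45 + i90 + i135);
    const float s1 = i0 - i90;
    const float s2 = i45 - i135;
    // Guard against division by zero for completely dark pixels.
    const float dolp = s0 > 0.0f ? std::sqrt(s1 * s1 + s2 * s2) / s0 : 0.0f;
    const float aolp = 0.5f * std::atan2(s2, s1);
    return PolarizationPixel{s0, dolp, aolp};
}
```

Because the function only depends on its four inputs, the same body could in principle serve as the per-work-item code of a GPU kernel in any of the compared APIs, with the CPU version acting as the correctness baseline.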
PERFORMANCE
It’s comparable.
USABILITY
How hard is it?
SINGLE-SOURCE MODELS DOMINATE FAST DEVELOPMENT
WHAT’S THE IMPLEMENTATION COST?
CUDA SYCL OpenCL Vulkan
LoC basic setup 4 5 6 65
LoC realistic setup 25 27 34 128 (+ 25 GLSL→SPIR-V)
LoC / new kernel 4 5 6 11
C++ kernels ✔ ✔ ✔
Implicit asynchronicity ✔ ✔ ✔
Taskgraph ✔ ✔
TOOLS MAKE DEVELOPMENT EASIER
ANY TOOLS TO HELP?
CUDA
• Solid dev tooling:
• kernel debugging
• profiling
• IDE integration
OpenCL
• Mostly vendor specific dev tools
• LPGPU² CodeXL: generalization of AMD project
Vulkan
• Mainly graphics focused tooling
• Validation layers
• Emulator (Talvos)
SYCL
• Hardly any specific tools, but native OpenCL / HIP tools usable
• Host-device enables native IDE debugging
LIBRARIES REDUCE DEVELOPMENT COST
LIBRARIES?
CUDA
• Many optimized libraries
• FFT, BLAS, image processing, …
OpenCL
• Number of libraries with some device-specific optimization
• FFT, BLAS, DGEMM, image processing, ...
Vulkan
• Hardly any Compute specific libraries
SYCL
• Some libraries
• BLAS, DNN, RNG, Parallel STL, image processing
• Native (OpenCL / HIP) libraries usable
APPLICATION ADOPTION AS INDICATOR FOR COMMUNITY SUPPORT
HELP ANYONE?
CUDA
• Widely used by scientists and application devs
• De-facto standard in ML libraries
• SO Questions: 12,380
OpenCL
• Wide adoption in consumer applications
• Adobe Creative Cloud
• Final Cut Pro
• SO Questions: 5,040
Vulkan
• Increasing adoption for mobile device support / combined with graphics
• Adobe Premiere Rush
• OctaneRender
• SO Questions: 1,020
SYCL
• Few applications known
• TensorFlow
• Eigen
• SO Questions: 28
Stack Overflow question tags counted on 2 March 2020
EACH API HAS ITS OWN (KERNEL) COMPILATION WORKFLOW
HOW DOES THE CODE COME TO LIFE?
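[Diagram: kernel compilation workflows of the four APIs]
• Inputs: CUDA source, OpenCL C kernel, Vulkan GLSL kernel, SYCL source
• Compilers and tools: Clang, nvcc, clspv, shaderc, libshaderc, glslang, ComputeCpp, hipSYCL, DPC++
• Intermediate representations: PTX, SPIR, SPIR-V
• Runtimes and drivers: CUDA runtime, OpenCL driver, Vulkan driver
• Result: Executable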
VARIETY OF IMAGE FORMATS WITH VARIOUS DATA TYPES
HOW TO HANDLE DYNAMIC DATA TYPES?
CUDA
• Generic programming
SYCL
• Generic programming
OpenCL
• (Dynamic) online compilation with required data type as macro
• Preprocessor programming to dispatch temporary data types
Vulkan
• Online / offline compilation with required data type as macro / in shader names
• Preprocessor programming to dispatch temporary data types
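The two strategies named on this slide can be sketched in host-only C++. The first part shows the generic-programming route available in single-source models such as CUDA and SYCL: the kernel body is a template, and a runtime pixel-type tag is mapped to a compile-time instantiation. The second part shows the macro route for OpenCL, where the data type is passed as a build option when the kernel is compiled at runtime. The enum `PixelType`, the functions `scalePixels`, `scaleImage` and `openclBuildOptions`, and the macro name `PIXEL_T` are all assumptions made for illustration; they do not come from the talk.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical runtime tag describing the pixel type of an incoming image.
enum class PixelType { U8, U16, F32 };

// Generic "kernel" body, written once and instantiated per pixel type.
// In CUDA or SYCL the same template could be compiled as device code.
// No saturation is applied: the caller must keep results within range of T.
template <typename T>
void scalePixels(const void* src, void* dst, std::size_t count, float gain) {
    const T* in = static_cast<const T*>(src);
    T* out = static_cast<T*>(dst);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<T>(static_cast<float>(in[i]) * gain);
    }
}

// Host-side dispatch from the runtime tag to a template instantiation.
inline void scaleImage(PixelType type, const void* src, void* dst,
                       std::size_t count, float gain) {
    switch (type) {
        case PixelType::U8:  scalePixels<std::uint8_t>(src, dst, count, gain);  return;
        case PixelType::U16: scalePixels<std::uint16_t>(src, dst, count, gain); return;
        case PixelType::F32: scalePixels<float>(src, dst, count, gain);         return;
    }
    throw std::invalid_argument("unsupported pixel type");
}

// Macro route for online compilation: the returned string is meant to be
// passed as the options argument when building an OpenCL program whose
// kernel source uses the (hypothetical) macro PIXEL_T as its pixel type.
inline std::string openclBuildOptions(PixelType type) {
    switch (type) {
        case PixelType::U8:  return "-D PIXEL_T=uchar";
        case PixelType::U16: return "-D PIXEL_T=ushort";
        case PixelType::F32: return "-D PIXEL_T=float";
    }
    throw std::invalid_argument("unsupported pixel type");
}
```

With templates, every supported type is compiled ahead of time and a mistake in the kernel body surfaces at build time. With the macro route, only the type actually needed is compiled, but errors appear when the kernel is built at runtime.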
PLATFORM INDEPENDENCE
Can it target XYZ?
WHAT HARDWARE CAN BE TARGETED USING WHICH VERSION?
CAN IT TARGET XYZ?
CUDA SYCL OpenCL Vulkan
Most recent version 10.2 1.2.1 2.2 1.2
Nvidia 10.2 1.2.1 1.2 1.2
AMD HIP 1.2.1 2.0 1.2
Intel 1.2.1 2.1 1.2
ARM 1.2.1 2.1 1.2
Windows ✔ ✔ ✔ ✔
Linux ✔ ✔ ✔ ✔
macOS ✔ ✔ ✔
Android ✔ ✔
CPU ✔ ✔ ✔
FPGAs ✔ ✔
PROJECTS INCREASING PORTABILITY OF THE APIS
PORTABILITY INITIATIVES
• clvk
• clspv
• SwiftShader
• HIP
• hipCL
• CUDA-on-CL
• CLonD12
So which API should be used?
CONCLUSION
CUDA
• Single-source programming
• Highly optimized and powerful libraries and tools
• Vendor lock-in acceptable? (Maybe use HIP instead?)
SYCL
• Single-source programming
• Multi-platform (incl. FPGAs, ...)
• Tools for underlying implementation usable
• Emerging SYCL-specific tool and library support
OpenCL
• Cross-platform (incl. FPGAs, ...)
• Mature libraries
• Big community
• Not-up-to-date implementations
Vulkan
• Fully OS and GPU-vendor independent
• High setup cost but possibility to optimize
• Lack of compute specific tooling & libraries
Full decision matrix: doi.org/10.1145/3388333.3388645
FUTURE PROSPECTS
SOME POSSIBLE DEVELOPMENTS
WHAT’S UP NEXT?
CUDA
• Extended ARM & data-center support
• Continuous optimization and feature updates
• Fast support for new GPU features
• HIP porting to Windows?
SYCL
• Maturing and optimization of implementations
• Extended hardware and OS support
• Removal of OpenCL as conformance required backend
• Specific tooling and libraries
• News @ IWOCL
• SYCL-on-Vulkan?
OpenCL
• (Hopefully) improved vendor support with OpenCL Next
• Updated to new hardware features
• Higher-level kernel language support
• News @ IWOCL
Vulkan
• Continued wide support
• Extended compute capabilities to serve as portability backend for other APIs
• Compute specific libraries & tools
• Fast support for new GPU features
© Copyright STEMMER IMAGING AG. All rights reserved. All texts, images, graphics, sound-, video- and animation files, as well as their
arrangements are copyright protected. Reprint, processing and duplication for commercial purposes or use on websites are forbidden.
Some STEMMER IMAGING pages contain images that are subject to copyright of the respective owner.
THANK YOU VERY MUCH
FOR YOUR ATTENTION
JOACHIM MEYER
STEMMER IMAGING AG
STEMMER-IMAGING.COM
JOAMEYER.DE