GPU Ecosystem

Page 1 GPU Ecosystem Introduction & Case Study Ofer Rosenberg October 2013

Upload: ofer-rosenberg

Post on 05-Dec-2014

DESCRIPTION

This presentation describes the components of the GPU ecosystem for compute, provides an overview of existing ecosystems, and contains a case study on NVIDIA Nsight.

TRANSCRIPT

Page 1: GPU Ecosystem

GPU Ecosystem Introduction & Case Study

Ofer Rosenberg

October 2013

Page 2: GPU Ecosystem

Content

GPU Ecosystem

Ecosystem on Mobile/Embedded Platforms

NSIGHT - Tools case study

Libraries

Page 3: GPU Ecosystem

GPU Ecosystem

Software product development cycle: Design, Write Code, Debug, Profile, leading to the Product

The role of the GPU ecosystem is to support, speed up, and improve this cycle for GPU compute

Page 4: GPU Ecosystem

GPU Ecosystem

Support writing code by:

IDE integration – Compiler, Parser, Wizards

Libraries: Math (BLAS, IPP-like, Matrix, etc.), STL-like (Thrust, BOLT)
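
To make "STL-like" concrete, here is a minimal host-only sketch in standard C++. It is not Thrust or BOLT code: it uses `std::transform` and `std::accumulate` on the CPU to show the algorithm-plus-container style that such libraries imitate (Thrust offers the analogous `thrust::transform` and `thrust::reduce` on device vectors). The function name `sum_of_squares` is a hypothetical helper chosen for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: sum of squares written in the STL style.
// Host-only sketch; it runs on the CPU with std:: containers and algorithms.
float sum_of_squares(const std::vector<float>& input) {
    std::vector<float> squared(input.size());

    // Element-wise step: square every value.
    std::transform(input.begin(), input.end(), squared.begin(),
                   [](float x) { return x * x; });

    // Reduction step: add all squared values together.
    return std::accumulate(squared.begin(), squared.end(), 0.0f);
}
```

The appeal of the STL-like approach is that the caller only names the operation (transform, reduce) and leaves the parallel implementation to the library.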

Support Debugging by:

IDE integration of the debugger (preferred)

Providing usable execution control (breakpoints, pause/resume, etc.)

Providing a reliable memory view of the various address spaces

Support Profiling by:

Providing two levels of profiling: System Tracing and Kernel Profiling

System Tracing - quick highlighting of hotspots and device optimal access

Statistical and timeline-based Kernel Profiling (using performance counters)

Page 5: GPU Ecosystem

Ecosystem on

Mobile/Embedded Platforms

Page 6: GPU Ecosystem

ARM MALI

Part of ARM SoC

OpenCL 1.1 Full Profile (Linux, Android)

Renderscript (Android only)

OpenCL SDK – Samples, Tutorials, etc.

No GPU debugging capability

ARM DS-5 (Developer Suite 5)

Eclipse IDE integration

Compiler, Debugger (CPU only)

System Trace – CPU & GPU

Deep Profiling - CPU & GPU

Page 7: GPU Ecosystem

Intel Haswell GPU

Part of Haswell (CPU & GPU)

OpenCL 1.2 Full Profile

Windows only for now (Linux @ alpha stage)

OpenCL SDK

Samples

Tools: Kernel Builder, VS/Eclipse Integration, Offline Compiler, GDB support (CPU Only)

No GPU debugging capability

VTune Amplifier XE supports OpenCL (CPU & GPU)

System level tracing (Application, Memory, Kernel launch)

Kernel Profiling

Page 8: GPU Ecosystem

Intel BayTrail platform (Atom)

BayTrail < 13W, BayTrail-M < 6.5W

Valleyview SoC (Z37xx)

GPU is based on Gen7 (same arch as IvyBridge)

Same as previous slide:

OpenCL 1.2 (Windows only for now)

OpenCL SDK

VTune support

System level tracing

Kernel Profiling

Page 9: GPU Ecosystem

NVIDIA Tegra 5 ? (Codename: Logan)

Disclaimer: Logan is due in early 2014. Some of the information here is speculation

Development Boards and Samples available to selected customers

Logan SoC – 2W

ARM Cortex-A15 CPU (4+1 cores): speculated

Kepler-based GPU: verified

CUDA support: verified

CUDA SDK – Dozens of samples

CUDA Libraries: Thrust, cuBLAS, NPP, etc.

NSIGHT: speculated

System Trace

Profiling, Debugging

Page 10: GPU Ecosystem

NSIGHT TOOLS CASE STUDY

Page 11: GPU Ecosystem

Nsight Highlights

“NVIDIA® Nsight™ is the ultimate development platform for heterogeneous computing”

(Taken from the Nsight page)

IDE integration

Windows – integration with Visual Studio

Linux – specialized Eclipse version

Debugging, System Trace, Profiling

Graphics (DX, OpenGL)

Computing (OpenCL, CUDA, C++ AMP)

Profiling only on CUDA kernels

Debug/Trace/Profile information is carefully presented

Highly efficient information fields, windows, diagrams

Feedback from professional users has visibly been taken into account

Page 12: GPU Ecosystem

Debugging

Much more than “just integrated” with the IDE

Purpose-built windows showing valuable info

Assembly (GPU!)

Variables across all warps

Visible layout of the stopped thread

Page 13: GPU Ecosystem

Debugging – Eclipse edition

The Eclipse integration appears to be deeper than the Visual Studio integration

Unified CPU / GPU Debugging

Simultaneous visibility into both CPU and GPU state

Multi-GPU support

Slides from: “CUDA Development Using NVIDIA Nsight, Eclipse Edition” by David Goodwin, SC12

Full GPU debugging

Set kernel breakpoints

Single-step, run until, etc.

View values across multiple GPU threads at the same time

Examine thread, warp, block state

Source and assembly level debugging
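
As background for "examine thread, warp, block state", the following host-only C++ sketch shows how a thread's position inside its block maps to a warp and a lane. It assumes a warp size of 32 (the value on NVIDIA GPUs) and the row-major thread numbering that CUDA uses within a block. The types `Dim3` and `ThreadLocation` and the function `locate_thread` are hypothetical names for illustration; they are not part of Nsight or of any NVIDIA API.

```cpp
#include <cstdint>

// Hypothetical helper types for illustration only.
struct Dim3 { std::uint32_t x, y, z; };
struct ThreadLocation { std::uint32_t linear_id, warp, lane; };

// Assumption: 32 threads per warp, as on NVIDIA GPUs.
constexpr std::uint32_t kWarpSize = 32;

// Map a 3D thread index inside a block to its linear id, warp, and lane.
ThreadLocation locate_thread(Dim3 thread_idx, Dim3 block_dim) {
    // Threads are numbered with x varying fastest, then y, then z.
    const std::uint32_t linear = thread_idx.x
                               + thread_idx.y * block_dim.x
                               + thread_idx.z * block_dim.x * block_dim.y;
    // Consecutive linear ids are grouped into warps of kWarpSize threads.
    return {linear, linear / kWarpSize, linear % kWarpSize};
}
```

For example, thread (5, 1, 0) in a 64x2x1 block has linear id 69, so it sits in warp 2 at lane 5. A debugger view of "variables across all warps" is essentially a table indexed by these two numbers.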

Page 14: GPU Ecosystem

System Trace

Page 15: GPU Ecosystem

Kernel Profiling

Choose a kernel to profile

Skip N kernels, Profile M kernels

Choose “experiments”

Experiment - a type of profiling/analysis

Nsight runs each profiled kernel launch dozens of times with the same data
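
The "skip N kernels, profile M kernels" setting can be read as a simple window over the sequence of kernel launches. The sketch below is one plausible interpretation in plain C++, not Nsight's actual logic: `should_profile` is a hypothetical function, and launches are assumed to be counted from zero.

```cpp
#include <cstddef>

// Hypothetical selection rule: ignore the first `skip` kernel launches,
// then profile the next `count` launches, and ignore everything after that.
// `launch_index` is assumed to be zero-based.
bool should_profile(std::size_t launch_index,
                    std::size_t skip,
                    std::size_t count) {
    return launch_index >= skip && (launch_index - skip) < count;
}
```

With `skip = 2` and `count = 3`, launches 2, 3, and 4 are profiled and all others run normally. Skipping early launches is useful when the first iterations are warm-up work that is not representative.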

Page 16: GPU Ecosystem

Profiling Results

Experiment list

Each experiment is a tabbed window

Profiling information is presented as graphs, pie charts, diagrams, etc.

HW counters are turned into easy-to-understand graphics

Information targets known HW bottlenecks, code inefficiencies, etc.

Remarkably well presented…

Page 17: GPU Ecosystem

Profiling Results

The information provides a quick, easy, and methodical way to identify the performance bottlenecks

Page 18: GPU Ecosystem

Eclipse Edition - Source Code Editor

Project Templates

CUDA code highlighting

CUDA aware refactoring

CUDA aware code completion and inline help

Page 19: GPU Ecosystem

LIBRARIES EXAMPLES

Page 20: GPU Ecosystem

CUDA Libraries – Part of the SDK

cuFFT

cuBLAS

cuRAND

cuSPARSE

NPP (like IPP)

Math Library

Thrust (next slide)
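
To illustrate what a BLAS library such as cuBLAS provides, here is a CPU reference for the level-1 AXPY operation, y = alpha * x + y, in plain C++. cuBLAS exposes the single-precision version as `cublasSaxpy`; the function below, `saxpy_reference`, is a hypothetical host-only stand-in that shows the arithmetic only and makes no GPU calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for single-precision AXPY: y = alpha * x + y.
// A GPU BLAS performs the same arithmetic, with one element per GPU thread.
void saxpy_reference(float alpha,
                     const std::vector<float>& x,
                     std::vector<float>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

A reference like this is also handy for checking the output of a GPU library call on small inputs.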

Page 22: GPU Ecosystem

OPENCL LIBRARIES

Page 23: GPU Ecosystem

CLPP

OpenCL Data Parallel Primitives Library (similar to Thrust)

Source: https://code.google.com/p/clpp/

7 committers, last commit about 1.5 years ago (as of October 2013)
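
As an example of what "data parallel primitive" means, the sketch below implements an exclusive prefix sum (scan) sequentially in plain C++. Scan is a typical primitive in libraries of this kind; this is a CPU reference for the semantics, not CLPP code, and `exclusive_prefix_sum` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for an exclusive prefix sum (scan):
// output[i] holds the sum of all input elements before position i.
std::vector<int> exclusive_prefix_sum(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    int running_total = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = running_total;   // sum of elements 0 .. i-1
        running_total += input[i];
    }
    return output;
}
```

For the input 3, 1, 4, 1, 5 the result is 0, 3, 4, 8, 9. A GPU library computes the same result in parallel, which is what makes scan a useful building block for sorting, compaction, and similar algorithms.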

Page 24: GPU Ecosystem

OpenCL BLAS

http://openclblas.sourceforge.net/

Code is available here (GPLv2):

http://sourceforge.net/projects/openclblas/

Page 25: GPU Ecosystem

ViennaCL

BLAS implementation

http://viennacl.sourceforge.net/

Looks very promising

Page 26: GPU Ecosystem

REFERENCES

Page 27: GPU Ecosystem

Platform links:

ARM

Developer site : http://malideveloper.arm.com

OpenCL tracing : http://malideveloper.arm.com/develop-for-mali/tools/mali-graphics-debugger/

DS-5 suite : http://www.arm.com/products/tools/software-tools/ds-5/index.php

OpenCL SDK : http://malideveloper.arm.com/develop-for-mali/sdks/mali-opencl-sdk/

OpenCL developer guide:

Online: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0538e/index.html

PDF: http://infocenter.arm.com/help/topic/com.arm.doc.dui0538e/DUI0538E_mali_t600_opencl_dg.pdf

NVIDIA

http://www.anandtech.com/show/7169/nvidia-demonstrates-logan-soc-mobile-kepler

http://www.slashgear.com/nvidia-tegra-logan-detailed-with-game-changing-cuda-integration-19274630/

http://www.ubergizmo.com/2013/07/nvidia-tegra-5-release-date-specs-news/

Page 28: GPU Ecosystem

Links:

Intel

OpenCL SDK: http://software.intel.com/en-us/vcsource/tools/opencl-sdk

GPA: http://software.intel.com/en-us/vcsource/tools/intel-gpa

VTune support for OpenCL: http://software.intel.com/en-us/articles/intel-vtune-amplifier-xe-getting-started-with-opencl-performance-analysis-on-intel-hd-graphics

http://www.theinquirer.net/inquirer/news/2266966/intel-releases-opencl-sdk-for-windows-and-linux

Haswell Linux support: http://www.phoronix.com/scan.php?page=news_item&px=MTA3NDc

OpenCL “Beignet” – open source Linux compiler:

http://software.intel.com/en-us/forums/topic/402118

http://linux.slashdot.org/story/13/04/16/014233/intel-releases-new-opencl-implementation-for-gnulinux

Atom BayTrail:

http://arstechnica.com/gadgets/2013/02/intel-gets-aggressive-with-new-smartphone-and-tablet-chips/

http://www.anandtech.com/show/7314/intel-baytrail-preview-intel-atom-z3770-tested

http://www.tomshardware.com/reviews/bay-trail-celeron-j1750-performance,3614-6.html

http://software.intel.com/en-us/forums/topic/476221

http://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors#.22Bay_Trail.22_.2822_nm.29

Page 29: GPU Ecosystem

NSIGHT Links

http://www.nvidia.com/object/nsight.html

https://developer.nvidia.com/nsight-visual-studio-edition-videos

https://developer.nvidia.com/developer-webinars

http://on-demand.gputechconf.com/supercomputing/2012/presentation/SB006-Goodwin-CUDA-Development-Nsight.pdf

http://on-demand.gputechconf.com/gtc/2013/presentations/S3011-CUDA-Optimization-With-Nsight-VSE.pdf