s9751 · • new with 10.1 - docker support eclipse ide for cuda . 9 nsight eclipse edition • use...
TRANSCRIPT
Aurelien Chartier & Steve Ulrich, March 19th, 2019
S9751:ACCELERATE YOUR CUDA DEVELOPMENT WITH LATEST DEBUGGING AND CODE ANALYSIS DEVELOPER TOOLS
2
AGENDA
Debugging Tools
Nsight Eclipse Edition
CUDA GDB
Nsight Visual Studio
CUDA Memcheck
Debug API
Steve Ulrich
Sanitizer API Aurelien Chartier
Steve Ulrich, March 19th, 2019
DEBUGGING TOOLS
4
WHO ARE YOU?
Have you …
• Ever written a CUDA application?
• Developed in CUDA for…
• … over a year?
• … over five years?
• Windows or Linux?
• Desktop or Embedded?
5
WHAT ARE YOUR PREFERENCES?CLI vs. IDE?
How many spaces in a tab?
Big Endian vs. Little Endian?
6
DEBUGGING STYLES
Before using a debugger:
Reboot (“you may have a future in tech support ☺”)
Google the error message
Ask for help
printf()
assert()
And eventually…
Break out the debugger
7
NVIDIA DEBUGGER PORTFOLIO
IDE Tools
Nsight Eclipse Edition
Nsight Visual Studio
CLI Tools
CUDA GDB
CUDA MEMCHECK
Development
Libraries
CUDA Debug API
8
NSIGHT ECLIPSE
• Install Nsight Eclipse plugins in your own Eclipse environment
• Supported in Eclipse version 4.7/4.8/4.9 for CUDA 10.1 (4.10 is in validation)
• Full featured IDE to edit, build and debug CUDA applications
• NVCC build integration to cross compile for various target platforms(x86/L4T/Drive Linux/Drive QNX).
• Debugger - Seamless and simultaneous debugging of both CPU and GPU code using CUDA GDB
• New with 10.1 - Docker Support
Eclipse IDE for CUDA
9
NSIGHT ECLIPSE EDITION
• Use the Drive OS Docker images from NVIDIA GPU Cloud to build the CUDA projects using Nsight EE.
• Nsight EE plugins supports mounting the project source files from the host to docker container and build the projects using CUDA toolkit/toolchains available inside docker container. No CUDA toolkit installation is required on the host.
• You can choose the docker image to use when creating the project or in preference page.
• Remote debug applications from Nsight EE using cuda-gdb inside docker container..
Docker Support
10
CUDA-GDB
Command line source and assembly (SASS) level debugger
Simultaneous CPU and GPU debugging
Inspect and modify memory, register, variable state
Control program execution
Support for multiple GPUs, multiple contexts, multiple kernels
11
CUDA-GDB
12
EVER HAD A CRASH AND WISHED YOU’D BEEN DEBUGGING?$ CUDA_DEVICE_WAITS_ON_EXCEPTION=1
13
EVER HAD A CRASH AND WISHED YOU’D BEEN DEBUGGING?$ CUDA_DEVICE_WAITS_ON_EXCEPTION=1
14
WHAT ABOUT CORE DUMPS?$ CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
15
WHAT ABOUT CORE DUMPS?$ CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
16
NSIGHT VISUAL STUDIO
Native CUDA C/C++ GPU Debugging
Source-correlated assembly debugging (SASS / PTX / SASS+PTX)
Data breakpoints for CUDA C/C++ code
Expressions in Locals, Watch and Conditionals
Visual Studio IDE for CUDA
17
NSIGHT VISUAL STUDIO
18
CUDA-MEMCHECK
Four tools in One:
Memory Checker
Race Condition Checker
Initialization Checker
Synchronization Checker
19
CUDA DEBUG API
ABI Support
Exception Reporting
Attach and Detach
Runtime control
State Inspection
Enable 3rd-party Debuggers
20
WHERE TO GET THE TOOLS?https://developer.nvidia.com/tools-overview
Aurelien Chartier, March 19th, 2019
CODE ANALYSIS – SANITIZER API
22
EVER…
• Used cuda-memcheck but needed more control?
• Needed a tool that let you analyze memory access patterns in your code?
• Wished for a PIN-like binary instrumentation tool for CUDA?
23
INTRODUCING THE SANITIZER APIReleased in CUDA 10.1
Host
Target
Memory
allocationsKernel launch
Device code
Stream sync
Callback host function
Callback device
function
Sanitizer callback API
Sanitizer patching API
24
SANITIZER API USAGE
static void ApiTrackerCallback(void* userdata,Sanitizer_CallbackDomain domain,Sanitizer_CallbackId cbid,const void* cbdata)
{auto* pCallbackData = (Sanitizer_CallbackData*)cbdata;auto returnValue = *(cudaError_t*)pCallbackData->functionReturnValue;
std::cout << "API call to " << pCallbackData->functionName <<" (return code "<< returnValue << ")" << std::endl;
}
Callback API
1. Write a callback function using Sanitizer_CallbackFunc function signature
25
SANITIZER API USAGE
2. Use sanitizerSubscribe to create a sanitizer subscriber
3. Use one of the sanitizerEnableCallback functions to enable/disable callbacks
As in the CUPTI API, only one subscriber can be active at any point.
Callback API
sanitizerSubscribe(&handle, ApiTrackerCallback, userdata);
sanitizerEnableDomain(1, handle, SANITIZER_CB_DOMAIN_RUNTIME_API);
26
API TRACKER
Memory
allocationsKernel launch
Device code
Stream sync
ApiTrackerCallback
Target
27
API TRACKER
28
SANITIZER API USAGE
1. Write patches as __device__ functions using the SanitizerCallback function signature
Patching API: device side
extern "C" __device__ __noinline__SanitizerPatchResult MemoryAccessCallback(
void* userdata, uint64_t pc, void* ptr,uint32_t accessSize, uint32_t flags)
{auto* pTracker = (MemoryAccessTracker*)userdata;uint32_t old = atomicAdd(&(pTracker->currentEntry), 1);
MemoryAccess& access = pTracker->accesses[old];access.address = (uint64_t)(uintptr_t)ptr;access.accessSize = accessSize;access.flags = flags;
return SANITIZER_PATCH_SUCCESS;}
29
SANITIZER API USAGE
2. Compile patches in a cubin or fatbin using new PTXAS option –-compile-as-tools-patch
Patching API: device side
$ nvcc --cubin --keep-device-functions –Xptxas –-compile-as-tools-patch MemoryTrackerPatches.cu –o MemoryTrackerPatches.cubin
30
SANITIZER API USAGE
1. Load patches in a CUDA context using sanitizerAddPatches
2. Mark which instructions should be patched using sanitizerPatchInstructions
3. Once all instructions have been marked, use sanitizerPatchModule to patch a given module
4. Use sanitizerSetCallbackData to specify user data passed to callbacks
Patching API: host side
sanitizerAddPatchesFromFile("MemoryTrackerPatches.cubin", ctx);
sanitizerPatchInstructions(SANITIZER_INSTRUCTION_MEMORY_ACCESS, module, "MemoryAccessCallback");
sanitizerPatchModule(pModuleData->module);
sanitizerSetCallbackData(stream, userdata);
31
MEMORY TRACKER
Module load Kernel launch
Device code
Stream sync
1. Load patches
2. Mark memory accesses to
be patched
3. Patch module
MemoryAccessCallback
Target
Host
4. Set userdata Retrieve userdata
32
MEMORY TRACKER
33
WHERE TO GET THE TOOL• Released in CUDA 10.1 under extras/Sanitizer folder
• Support for Windows, Linux (x64 only), OSX
• Support for Maxwell and newer GPUs
• Missing features:
• CDP (partial support)
• CUDA graphs
• Atomics on short types
• Get help with either:
• https://devtalk.nvidia.com/default/board/373/compute-sanitizer-api/
34
SANITIZER SAMPLES
Find the full samples code for the ApiTracker and MemoryTracker at:
https://github.com/NVIDIA/compute-sanitizer-samples
35
DEVELOPER TOOLS AT GTC19
Talks:S9751: Accelerate Your CUDA Development with Latest Debugging and Code Analysis Developer Tools, Tue @9amS9866 - Optimizing Facebook AI Workloads for NVIDIA GPUs, Tue @9amS9345: CUDA Kernel Profiling using NVIDIA Nsight Compute, Tue @1pmS9661: Nsight Graphics - DXR/Vulkan Profiling/Vulkan Raytracing, Wed @10amS9503: Using Nsight Tools to Optimize the NAMD Molecular Dynamics Simulation Program, Wed @1pm
Hands-on labs:L9102: Jetson Developer Tools Training Lab, Mon @9am, 11:30amL9124: Debugging and optimizing CUDA applications with Nsight products on Linux training lab, Tue @8am, 10am
Connect with the Experts (where DevTools will be available):CE9123: CUDA & Graphics Developer Tools, Tue @2pm, Wed @3pmCE9137: Jetson Embedded Platform, Tue @12pm, 5pm, Wed @1pm, 4pm, Thu @12pm
Podium: Demos of DevTools products on Linux, DRIVE AGX & Jetson AGX at the showfloorTue @12pm – 7pmWed @12pm – 7pmThu @11am – 2pm