General-Purpose Computation on Graphics Hardware
Introduction
David Luebke University of Virginia
Course Introduction
• The GPU on commodity video cards has evolved into an extremely flexible and powerful processor
  – Programmability
  – Precision
  – Power
• This course will address how to harness that power for general-purpose computation
Motivation: Computational Power
• GPUs are fast…
  – 3 GHz Pentium 4 theoretical: 6 GFLOPS, 5.96 GB/sec peak
  – GeForce FX 5900 observed: 20 GFLOPS, 25.3 GB/sec peak
• GPUs are getting faster, faster
  – CPUs: annual growth 1.5×, decade growth 60×
  – GPUs: annual growth > 2.0×, decade growth > 1000×
Courtesy Kurt Akeley, Ian Buck & Tim Purcell, GPU Gems (see course notes)
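The decade figures follow from compounding the annual rates over ten years. A minimal sketch of that arithmetic, assuming the slide's rounded annual growth factors (the function name `decade_growth` is illustrative and not from the course materials):

```python
def decade_growth(annual_factor: float, years: int = 10) -> float:
    """Compound an annual growth factor over a number of years."""
    return annual_factor ** years

cpu = decade_growth(1.5)  # about 57.7, which the slide rounds to 60x
gpu = decade_growth(2.0)  # exactly 1024, hence "> 1000x" for growth above 2.0x
print(f"CPU: {cpu:.1f}x per decade, GPU: {gpu:.0f}x per decade")
```

The gap widens quickly because the rates compound: a modest difference in annual growth becomes more than an order of magnitude over a decade.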
Motivation: Computational Power
[Figure: performance comparison with series labeled GPU and CPU. Courtesy Naga Govindaraju]
An Aside: Computational Power
• Why are GPUs getting faster so fast?
  – Arithmetic intensity: the specialized nature of GPUs makes it easier to use additional transistors for computation, not cache
  – Economics: the multi-billion dollar video game market is a pressure cooker that drives innovation
Motivation: Flexible and Precise
• Modern GPUs are deeply programmable
  – Programmable pixel, vertex, video engines
  – Solidifying high-level language support
• Modern GPUs support high precision
  – 32-bit floating point throughout the pipeline
  – High enough for many (not all) applications
Motivation: The Potential of GPGPU
• The power and flexibility of GPUs make them an attractive platform for general-purpose computation
• Example applications range from in-game physics simulation to conventional computational science
• Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
The Problem: Difficult To Use
• GPUs designed for and driven by video games
  – Programming model is unusual & tied to computer graphics
  – Programming environment is tightly constrained
• Underlying architectures are:
  – Inherently parallel
  – Rapidly evolving (even in basic feature set!)
  – Largely secret
• Can’t simply “port” code written for the CPU!
Course Goals
• A detailed introduction to general-purpose computing on graphics hardware
• We emphasize:
  – Core computational building blocks
  – Strategies and tools for programming GPUs
  – Tips & tricks, perils & pitfalls of GPU programming
• Several case studies to bring it all together
Why a SIGGRAPH Course?
• Why SIGGRAPH, instead of (say) Supercomputing?
  – Many graphics applications stand to benefit from GPGPU
    • “Hot topic” case studies: tone mapping, level sets, fluids
    • Keeping computation on-card!
  – Many graphics applications strive for visual plausibility rather than rigorous scientific realism
    • Better tolerate GPU limitations in precision, memory
    • Well suited as GPGPU “early adopters”
  – GPGPU programming still requires the expertise of the SIGGRAPH audience
Course Prerequisites
• We assume
  – Familiarity with interactive graphics and computer graphics hardware
  – Ideally, some experience programming vertex & pixel shaders
• Target audience
  – Researchers interested in GPGPU
  – Graphics and games developers interested in incorporating these techniques into their applications
  – Attendees wishing a survey of this exciting new field
Course Topics
• GPU building blocks
• Languages and tools
• Effective GPU programming
• GPGPU case studies
Course Topics: Details
• GPU building blocks
  – Linear algebra
  – Sorting and searching
  – Database operations
• Languages and tools
  – High-level languages
  – Debugging tools
Course Topics: Details
• Effective GPU programming
  – Efficient data-parallel programming
  – Data formatting & addressing
  – GPU computation strategies & tricks
• Case studies in GPGPU programming
  – Physically-based simulation on GPUs
  – Ray tracing & photon mapping on GPUs
  – Tone mapping on GPUs
  – Level sets on GPUs
Speakers in Order of Appearance
• David Luebke, University of Virginia
• Mark Harris, NVIDIA
• Jens Krüger, TU-Munich
• Tim Purcell, Stanford (NVIDIA)
• Naga Govindaraju, University of North Carolina
• Ian Buck, Stanford
• Cliff Woolley, University of Virginia
• Aaron Lefohn, University of California Davis
Course Schedule: GPU Building Blocks
8:30 Introduction (Luebke)
Welcome, overview, the graphics pipeline
9:00 Mapping computational concepts to the GPU (Harris)
Streaming, resources, CPU-GPU analogies, branching
9:20 Linear algebra (Krüger)
Representations, operations, example algorithms
9:55 Sorting & searching (part 1) (Purcell)
Bitonic sort, binary search
10:15 Break
Course Schedule: Languages & Tools
10:30 Sorting & searching (part 2) (Purcell)
Nearest-neighbor search
10:45 Database operations (Govindaraju)
Queries, boolean predicates, aggregation
11:15 High-level languages (Buck)
Cg/HLSL/GLslang, Sh, Brook
11:45 Debugging tools (Purcell)
imdebug, DirectX/OpenGL shader IDEs, ShadeSmith
12:15 Lunch break
Course Schedule: Effective GPU Programming
1:45 Efficient data-parallel GPU programming (Woolley)
Computational frequency, profiling, load balancing
2:15 Data formatting & addressing (Lefohn)
Memory layout, data structures
2:45 GPU computation strategies & tricks (Buck)
Precision, performance, scatter, branching
3:15 Q & A (All)
Questions for the speakers?
3:30 Break
Course Schedule: GPGPU Case Studies
3:45 Physically-based simulation on GPUs (Harris)
Reaction-diffusion, fluids, clouds
4:10 Tone mapping on GPUs (Woolley)
High-dynamic range images, tone mapping
4:35 Level sets on GPUs (Lefohn)
Streaming level sets, visualization, segmentation
5:00 Global illumination on GPUs (Purcell)
Ray tracing, photon mapping
5:30 Wrap!
GPU Fundamentals: The Graphics Pipeline
• A simplified graphics pipeline
  – Note that pipe widths vary
  – Many caches, FIFOs, and so on not shown
[Figure: pipeline diagram. CPU side: Application. GPU side: Transform → Rasterizer → Shade → Video Memory (Textures). Data passed between stages: Vertices (3D) → Transformed, Lit Vertices (2D) → Fragments (pre-pixels) → Final pixels (Color, Depth). Also labeled: Graphics State, Render-to-texture.]
GPU Fundamentals: The Modern Graphics Pipeline
• Programmable vertex processor!
• Programmable pixel processor!
[Figure: pipeline diagram with programmable stages. CPU side: Application. GPU side: Vertex Processor → Rasterizer → Pixel Processor (also labeled Fragment Processor) → Video Memory (Textures). Data passed between stages: Vertices (3D) → Transformed, Lit Vertices (2D) → Fragments (pre-pixels) → Final pixels (Color, Depth). Also labeled: Graphics State, Render-to-texture.]
GPU Pipeline: Transform
• Vertex Processor (multiple operate in parallel)
  – Transform from “world space” to “image space”
  – Compute per-vertex lighting
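As a rough illustration of the transform step, the sketch below carries a world-space vertex to pixel coordinates with a 4×4 matrix, a perspective divide, and a viewport mapping. It is a simplification under stated assumptions: the names `mat_vec`, `to_image_space`, and `toy_proj` are hypothetical, the matrix is a toy projection rather than a real camera, and per-vertex lighting is omitted.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def to_image_space(vertex, view_proj, width, height):
    """Transform a 3D world-space vertex to 2D pixel coordinates plus depth."""
    x, y, z, w = mat_vec(view_proj, [vertex[0], vertex[1], vertex[2], 1.0])
    # Perspective divide: clip space -> normalized device coordinates in [-1, 1].
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport mapping: [-1, 1] -> pixel coordinates.
    px = (ndc_x + 1.0) * 0.5 * width
    py = (ndc_y + 1.0) * 0.5 * height
    return px, py, ndc_z

# Hypothetical toy projection: w takes the value of z, so distant points
# shrink toward the image center. Depth is not meaningful with this matrix.
toy_proj = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# A GPU applies the same program to many vertices at once; here we simply loop.
triangle = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (0.0, 1.0, 4.0)]
screen = [to_image_space(v, toy_proj, 640, 480) for v in triangle]
print(screen)
```

Each vertex is processed independently of the others, which is what allows several vertex processors to work in parallel.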
GPU Pipeline: Rasterizer
• Rasterizer
  – Convert geometric rep. (vertex) to image rep. (fragment)
    • Fragment = image fragment
      – Pixel + associated data: color, depth, stencil, etc.
  – Interpolate per-vertex quantities across pixels
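The two rasterizer jobs above, generating fragments and interpolating per-vertex quantities, can be sketched with barycentric weights. This is a minimal illustration, not how hardware does it: the helper names `edge` and `rasterize` are hypothetical, a single scalar stands in for the per-vertex data, and real rasterizers add fill rules and perspective-correct interpolation.

```python
def edge(a, b, p):
    """Signed area term: tells which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Yield (x, y, value) fragments for pixels whose centers fall in the triangle."""
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights: each is 1 at its own vertex, 0 on the opposite edge.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                # Interpolate the per-vertex quantity at this pixel.
                yield x, y, w0 * c0 + w1 * c1 + w2 * c2

# A right triangle on a 4x4 pixel grid; the quantity is 1.0 at the second vertex only.
fragments = list(rasterize((0, 0), (4, 0), (0, 4), 0.0, 1.0, 0.0, 4, 4))
print(len(fragments), fragments[0])
```

Each yielded tuple plays the role of a fragment: a pixel location plus associated interpolated data.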
GPU Pipeline: Shade
• Fragment Processors (multiple in parallel)
  – Compute a color for each pixel
  – Optionally read colors from textures (images)
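A fragment program can be thought of as a small function run once per pixel. The sketch below assumes a nearest-neighbor texture read and a constant tint; the names `sample_nearest`, `fragment_program`, and `shade` are hypothetical and chosen only for illustration.

```python
def sample_nearest(texture, u, v):
    """Read a texel from a 2D texture (list of rows) at coordinates in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]

def fragment_program(u, v, texture, tint):
    """Hypothetical per-fragment kernel: texture color modulated by a tint."""
    r, g, b = sample_nearest(texture, u, v)
    return (r * tint[0], g * tint[1], b * tint[2])

def shade(width, height, texture, tint):
    """Run the fragment program once per pixel; no call depends on another."""
    return [
        [fragment_program((x + 0.5) / width, (y + 0.5) / height, texture, tint)
         for x in range(width)]
        for y in range(height)
    ]

# A 2x2 checkerboard texture stretched over a 4x4 image.
checker = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
           [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
image = shade(4, 4, checker, tint=(1.0, 0.5, 0.25))
print(image[0][0], image[0][3])
```

Because no call to the fragment program depends on another, the work can be spread across multiple fragment processors, which is the property general-purpose use of the GPU relies on.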
Coming Up
• Next: Mapping computational concepts to the GPU
• Also coming up:
  – Core building blocks for GPGPU computing
  – Memory layout, data structures, and algorithms
  – Detailed advice on writing high performance GPGPU code
  – Lots of examples