bending the graphics pipeline

Beyond Programmable Shading Course, ACM SIGGRAPH 2010: Bending the Graphics Pipeline, Johan Andersson, DICE

Upload: dice

Post on 07-May-2015

DESCRIPTION

Talk from SIGGRAPH 2010 and the Beyond Programmable Shading course. Also see publications.dice.se for more material and other DICE talks.

TRANSCRIPT

Page 1: Bending the Graphics Pipeline

Beyond Programmable Shading Course, ACM SIGGRAPH 2010

Bending the Graphics Pipeline

Johan Andersson

DICE

Page 2: Bending the Graphics Pipeline

Overview

• Give a taste of a few rendering techniques we are using & experimenting with, and how they interact (or how we would like them to interact) with the graphics pipeline

• Tile-based Deferred Shading

• Morphological Antialiasing

• Analytical Ambient Occlusion

04/11/23 2Beyond Programmable Shading, SIGGRAPH 2010

Page 3: Bending the Graphics Pipeline

TILE-BASED DEFERRED SHADING

Page 4: Bending the Graphics Pipeline

Tile-based deferred shading

• Tile-based culling & lighting
  – Cull lights per screen-space tile
  – Lighting kernel runs per tile
  – Minimizes bandwidth/setup cost

• DX11: GPU compute shader
  – Covered in the course last year [Andersson09]

• PS3: SPU jobs
  – GPU renders gbuffer
  – SPU does light culling & full lighting evaluation for each pixel
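The per-tile culling step can be sketched as follows. This is a minimal illustration, assuming each light has already been reduced to a screen-space bounding circle; the `Light` tuple, the 16x16 default tile size, and the function names are hypothetical and not taken from the Frostbite code. A production version would typically also cull against the depth range of each tile.

```python
from collections import namedtuple

# Hypothetical light description: screen-space bounding circle in pixels.
Light = namedtuple("Light", ["x", "y", "radius"])

def circle_overlaps_rect(light, x0, y0, x1, y1):
    """True if the light's circle touches the rectangle [x0, x1] x [y0, y1]."""
    # Closest point on the rectangle to the circle center.
    cx = min(max(light.x, x0), x1)
    cy = min(max(light.y, y0), y1)
    dx, dy = light.x - cx, light.y - cy
    return dx * dx + dy * dy <= light.radius * light.radius

def cull_lights_per_tile(lights, width, height, tile_w=16, tile_h=16):
    """Return {(tile_x, tile_y): [light indices]} for every screen tile."""
    tiles = {}
    for ty in range(0, height, tile_h):
        for tx in range(0, width, tile_w):
            x1 = min(tx + tile_w, width)
            y1 = min(ty + tile_h, height)
            tiles[(tx // tile_w, ty // tile_h)] = [
                i for i, light in enumerate(lights)
                if circle_overlaps_rect(light, tx, ty, x1, y1)
            ]
    return tiles

lights = [Light(8, 8, 4), Light(40, 8, 10)]
tiles = cull_lights_per_tile(lights, width=64, height=32)
```

The lighting kernel for a tile then only loops over that tile's short light list instead of every light in the scene.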

Page 5: Bending the Graphics Pipeline

Multiple deferred lighting models

• Standard Phong
• Metallic
• Skin
• Translucent

Page 6: Bending the Graphics Pipeline

Working with tiles

• Tile culling optimizations
  – Cull lights & shadows with tile normal cone
  – Detect tile specular=0
  – Detect tile lighting model

• Tile lighting kernel permutations
  – Specular on/off
  – Lighting models
  – More in the future
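One way to picture the tile detection and kernel selection above is a small classifier that inspects the pixels of a tile and returns a permutation key. This is only a sketch: the `(lighting_model, specular)` pixel layout, the `"mixed"` fallback, and the function name are assumptions made for illustration.

```python
def classify_tile(pixels):
    """Pick a lighting-kernel permutation key for one tile.

    `pixels` is a list of (lighting_model, specular) pairs read from the
    gbuffer; this layout is an assumption made for the sketch.
    """
    models = {model for model, _ in pixels}
    # Specular can be skipped only if every pixel in the tile has specular == 0.
    has_specular = any(spec > 0.0 for _, spec in pixels)
    # A tile with exactly one lighting model can use a specialized kernel;
    # other tiles fall back to a general kernel that branches per pixel.
    model_key = next(iter(models)) if len(models) == 1 else "mixed"
    return (model_key, has_specular)

tile_a = [("phong", 0.0)] * 4
tile_b = [("phong", 0.3), ("skin", 0.0)]
key_a = classify_tile(tile_a)
key_b = classify_tile(tile_b)
```

The returned key would then index a table of precompiled kernel variants, so uniform tiles avoid both the specular math and the per-pixel model branch.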

Page 7: Bending the Graphics Pipeline

SPU-based Deferred Shading

• Ported DX11 compute shader to SPU job
  – Offloads PS3 GPU
  – SPU processing in parallel with GPU rendering
  – 32x16 pixel tiles

• Explicit SoA vectorization instead of implicit
  – C/C++ on SPU, HLSL on GPU
  – Not a problem for such a relatively small kernel
  – But not an ideal data-parallel programming model

Page 8: Bending the Graphics Pipeline

SPU vs GPU architecture

• 6 execution contexts vs 1+ million (each pixel)
• Explicit SIMD vs implicit SIMD
• C/C++ vs HLSL
• Explicit async DMA vs implicit latency hiding

• What can we learn?

Page 9: Bending the Graphics Pipeline

Issues & challenges going forward

• More lighting models
  – SIMD & branching efficiency

• Transparent decal surfaces & volumes
  – Fixed function blending doesn’t work well with deferred

• Higher-quality antialiasing

Page 10: Bending the Graphics Pipeline

Flexible lighting models

• Want both more & more flexible models:
  – Custom gbuffer layout per material
  – Quality & performance tradeoffs

• Examples:
  – Hair / anisotropic materials
    • Requires more lighting model parameters in gbuffer
  – Foliage
    • Massive overdraw with alpha-tested simple shaders, few parameters
    • Write to as simple gbuffer as possible to reduce ROP/bandwidth bottleneck
  – Skin
    • Sub-surface scattering approximation

Page 11: Bending the Graphics Pipeline

The SIMD efficiency problem

• Lighting models through dynamic branches

• GPU shader model can be problematic:
  – Increased register pressure = overall slower shader
  – Requires good screen-space SIMD coherency for performance win

• Potential solutions:
  – Reshuffle pixels to improve coherency?
    • Within each tile, sort pixels by model, compute lighting & then scatter back
  – GRAMPS-style queuing? [Sugerman09]
    • Attractive & powerful high-level programming model
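The "sort, compute, scatter back" idea can be sketched on the CPU as below. It is an illustration only: the per-model shading functions are placeholders, and a GPU or SPU version would process each coherent run as SIMD batches rather than one pixel at a time.

```python
def shade_tile_sorted(models, shade_fns):
    """Shade one tile by grouping its pixels per lighting model.

    `models[i]` is the lighting model id of pixel i; `shade_fns` maps a model
    id to a function that takes a pixel index and returns its shaded value.
    Both are hypothetical stand-ins for real gbuffer data and kernels.
    """
    # Stable sort of pixel indices by model id gathers coherent runs.
    order = sorted(range(len(models)), key=lambda i: models[i])
    result = [None] * len(models)
    for i in order:
        # Consecutive iterations share a model, which is what a SIMD batch needs.
        result[i] = shade_fns[models[i]](i)  # scatter back to the original slot
    return order, result

models = [2, 0, 2, 1, 0, 2]
shade_fns = {m: (lambda i, m=m: (m, i)) for m in set(models)}
order, result = shade_tile_sorted(models, shade_fns)
```

Whether the extra sort and scatter pay off depends on how incoherent the tile is to begin with, which is the open question the slide raises.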

Alpha-tested foliage has far from ideal coherency

Page 12: Bending the Graphics Pipeline

Decals & deferred shading

• Decals blend selectively against gbuffer
  – Include:
    • Diffuse albedo (gbuffer1.rgb)
    • Normal (gbuffer0.rgb)
  – Want to include (but can’t in single pass):
    • Specular albedo (gbuffer1.a)
    • Specular smoothness (gbuffer0.a)
  – Exclude:
    • Material id (can’t blend)
    • Object lighting (inherit from below surface)

• Fixed function blending doesn’t work well
  – Pixel shader can’t write out both alpha & blend factor!
  – Consoles don’t have blend mode per MRT
  – Linear blend doesn’t work for all components

See Destruction Masking in Frostbite 2 using Volume Distance Fields [Kihl10] for more details about decal use case

Page 13: Bending the Graphics Pipeline

Need programmable blending

• Benefits:
  – Write out gbuffer alpha channels independently of blend factor
  – Treat channels & targets however you see fit
  – Non-linear blending & renormalizing blends
  – Can do overlapping dependent blending
    • Read current normal, add bumps relative to it, write out

• What approach?
  – LRB-style pixel shader framebuffer read/modify/write [Lalonde09]
    • Ideal general solution for developers
    • How to hide synchronization latency? Implicit / explicit?
  – Blend shader
    • Yet another stage in a fixed pipeline
    • No R/M/W, not ideal
  – More?
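What a programmable blend could do for a decal can be sketched as a plain read/modify/write function. This is a CPU-side illustration under assumptions: the dictionary layout, the parameter names, and the separate `smoothness_alpha` factor are hypothetical, chosen to mirror the gbuffer0 channels mentioned earlier (normal in rgb, specular smoothness in alpha).

```python
import math

def blend_decal(dst, bump_offset, decal_smoothness, normal_strength, smoothness_alpha):
    """Read the current gbuffer0 value, apply a decal, and return the new value.

    `dst` holds 'normal' (xyz) and 'smoothness' (the alpha channel).
    """
    # Dependent blend: perturb the normal that is already in the gbuffer.
    perturbed = tuple(n + normal_strength * b
                      for n, b in zip(dst["normal"], bump_offset))
    length = math.sqrt(sum(c * c for c in perturbed))
    # Renormalize; keep the old normal if the perturbation cancels it out.
    normal = tuple(c / length for c in perturbed) if length > 0.0 else dst["normal"]
    # The alpha channel uses its own blend factor, independent of the normal.
    smoothness = (dst["smoothness"] * (1.0 - smoothness_alpha)
                  + decal_smoothness * smoothness_alpha)
    return {"normal": normal, "smoothness": smoothness}

dst = {"normal": (0.0, 0.0, 1.0), "smoothness": 0.2}
out = blend_decal(dst, (1.0, 0.0, 0.0), 0.8,
                  normal_strength=1.0, smoothness_alpha=0.5)
```

Neither the renormalization nor the independent alpha factor can be expressed with a single fixed-function linear blend, which is the limitation the previous slide describes.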

Page 14: Bending the Graphics Pipeline

The deferred shading + MSAA problem

• Huge storage & bandwidth requirements with deferred
  – 1920 x 1080 x 5 x 4 x 4 = 165 MB
  – Doesn’t scale! Adding 1 bit of precision = 2x more memory

• 4x MSAA is not enough
  – Esp. for thin geometry in the distance

• Prohibitive performance and bandwidth in general with deferred shading
  – But don’t miss Andrew Lauritzen’s talk later in the course: Deferred Rendering for Current and Future Rendering Pipelines

• There are alternatives to MSAA...
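The storage figure above can be reproduced with a few lines of arithmetic. The meaning of the three small factors is not spelled out on the slide; reading them as 5 render targets, 4 bytes per sample, and 4 MSAA samples is an assumption, as are the function and parameter names.

```python
def gbuffer_bytes(width, height, num_targets, bytes_per_sample, msaa_samples):
    """Total gbuffer storage in bytes for a multisampled deferred renderer."""
    return width * height * num_targets * bytes_per_sample * msaa_samples

total = gbuffer_bytes(1920, 1080, num_targets=5, bytes_per_sample=4, msaa_samples=4)
no_msaa = gbuffer_bytes(1920, 1080, num_targets=5, bytes_per_sample=4, msaa_samples=1)

print(total)          # 165888000 bytes, roughly the 165 MB quoted on the slide
print(total / 2**20)  # the same amount expressed in MiB (about 158)
print(no_msaa)        # 41472000 bytes without MSAA, a quarter of the total
```

The cost is linear in every factor, so doubling the sample count or the bytes per sample doubles the footprint.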

Page 15: Bending the Graphics Pipeline

MLAA – Morphological Antialiasing

• Post-effect antialiasing
• Introduced in [Reshetov09]

• Implementations:
  – Intel CPU reference implementation [Reshetov09]
  – Sony PS3 SPU implementation [Perthuis10]
  – GPU compute? [Biri10]

Page 16: Bending the Graphics Pipeline

MLAA workings

From [Reshetov09]

Page 17: Bending the Graphics Pipeline

MLAA comparisons (PS3)

[Figure: side-by-side comparison of No AA and MLAA]

Page 18: Bending the Graphics Pipeline

MLAA takeaways

• Awesome AA for still pictures

• Moving pictures good, but:
  – No sub-pixel information = edges snap to pixels
  – Doesn’t solve aliasing on fine detail geometry
  – Overall still a very good benefit!

• Focus/exclude effect based on framebuffer alpha & thresholds
  – Unique requirements per game/app
  – Not good to use on some UI, mark in alpha (or apply before)

• Variable post-effect, trade perf vs quality!
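The alpha and threshold controls mentioned above can be illustrated on the discontinuity search that an MLAA-style filter starts from. This is a rough sketch, not the PS3 implementation: it only checks horizontal neighbors, works on a luminance image, and treats `alpha == 0` as "excluded" by an assumed convention.

```python
def find_discontinuities(luma, alpha, threshold=0.1):
    """Return (x, y) positions whose right-hand neighbor differs by more
    than `threshold` in luminance.

    Pixels with alpha == 0 are skipped (assumed convention, e.g. for UI).
    """
    edges = []
    for y, row in enumerate(luma):
        for x in range(len(row) - 1):
            if alpha[y][x] == 0 or alpha[y][x + 1] == 0:
                continue  # excluded from the effect
            if abs(row[x] - row[x + 1]) > threshold:
                edges.append((x, y))
    return edges

luma = [[0.0, 0.0, 1.0],
        [0.0, 1.0, 1.0]]
alpha = [[1, 1, 1],
         [0, 1, 1]]
edges = find_discontinuities(luma, alpha)
```

Raising the threshold finds fewer edges and so processes fewer pixels, which is one way to trade quality for performance.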

Page 19: Bending the Graphics Pipeline

MLAA future (PC)

• GPU compute shader implementation

• Combine with MSAA & sub-pixel samples
  – Simple MSAA box filter downsampling is a big waste
  – Sort of similar to A Directionally Adaptive Edge Anti-Aliasing Filter [Yang09]
  – A must to reduce the edge snapping of pure MLAA
  – Not fully clear how it should work (sample distribution)

Page 20: Bending the Graphics Pipeline

AMBIENT OCCLUSION

Page 21: Bending the Graphics Pipeline

Current dynamic AO

• Horizon-based Ambient Occlusion
  – See [Bavoil08] for complete details

• Based on screen-space depth-buffer (SSAO)
  – Very high quality sampling
  – But only screen-space info is a big limitation
  – Creates false occlusion artifacts

• Render in half-res for improved performance
  – Bilateral upsampling + Gaussian blur
  – Can also do dual-resolution to reduce artifacts
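Bilateral upsampling of a half-res AO buffer can be sketched in one dimension as below. The weighting function, the `sigma` softening term, and the choice of two low-res taps per pixel are assumptions made for this sketch; a real implementation works in 2D and may weight by normals as well.

```python
def bilateral_upsample_1d(ao_lo, depth_lo, depth_hi, sigma=0.1):
    """Upsample half-res AO to full res, weighting each low-res tap by how
    close its depth is to the full-res pixel's depth."""
    out = []
    last = len(ao_lo) - 1
    for x, d in enumerate(depth_hi):
        base = x // 2
        neighbor = base + 1 if x % 2 else base - 1
        # Two nearest low-res taps, clamped to the valid range and deduplicated.
        taps = sorted({min(max(t, 0), last) for t in (base, neighbor)})
        num = den = 0.0
        for t in taps:
            w = 1.0 / (abs(depth_lo[t] - d) + sigma)  # similar depth, large weight
            num += w * ao_lo[t]
            den += w
        out.append(num / den)
    return out

ao_lo = [0.2, 0.9]
depth_lo = [1.0, 5.0]
depth_hi = [1.0, 1.0, 5.0, 5.0]
ao_hi = bilateral_upsample_1d(ao_lo, depth_lo, depth_hi)
```

Because the weights follow depth, the AO value from the far surface barely leaks onto the near surface at the depth edge between pixels 1 and 2.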

Page 22: Bending the Graphics Pipeline

Horizon-based Ambient Occlusion

False occlusion halo from thin geometry

Page 23: Bending the Graphics Pipeline

HBAO limitations

• False halo occlusion artifacts around small geometry
  – Such as: fences & poles
  – Extra visible when moving the camera

• Very noisy sampling for detailed zbuffers
  – Common with alpha-tested foliage
  – Difficult sampling problem

Page 24: Bending the Graphics Pipeline

Analytical Ambient Occlusion

Page 25: Bending the Graphics Pipeline

HBAO vs AAO

Page 26: Bending the Graphics Pipeline

Analytical Ambient Occlusion

• Using Ambient Occlusion Volumes
  – [McGuire10]

• Experimental implementation in Frostbite 2
  – With some good help from Morgan McGuire and Louis Bavoil

• Geometry-based technique
  – Not screen-space!
  – Say what?

Page 27: Bending the Graphics Pipeline

AOV idea

1. Extrude prism for each triangle (GS)
  – Extrusion distance is where occlusion=0

2. Rasterize primitives in prism
  – With depth-test enabled, near depth clip disabled
  – Finds visible points inside volume
  – Need to handle case with camera inside volume

3. Accumulate analytical occlusion contribution for visible pixels (PS)
  – Uses pixel normal & depth values from gbuffer
  – Subtractive blend
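Step 1 can be sketched on the CPU as follows. This is a simplified illustration that only pushes a copy of the triangle along its face normal; the exact volume construction in [McGuire10] may differ, and the helper names are hypothetical. In the pipeline described above this work would happen in the geometry shader.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def extrude_prism(tri, distance):
    """Return the six corners of a prism: the triangle itself plus a copy
    moved `distance` along the face normal (where occlusion falls to 0)."""
    a, b, c = tri
    n = cross(sub(b, a), sub(c, a))
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        raise ValueError("degenerate triangle has no face normal")
    offset = tuple(distance * x / length for x in n)
    top = [tuple(p_i + o_i for p_i, o_i in zip(p, offset)) for p in tri]
    return list(tri) + top

prism = extrude_prism(((0, 0, 0), (1, 0, 0), (0, 1, 0)), distance=0.5)
```

A larger extrusion distance produces a larger prism and therefore more rasterized pixels per triangle, which is why the AO distance drives the overdraw cost.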

Page 28: Bending the Graphics Pipeline

[Figure: HBAO]

Page 29: Bending the Graphics Pipeline

[Figure: HBAO and AOV]

Page 30: Bending the Graphics Pipeline

AOV in practice

• Render geometry again in separate AO pass
  – Uses depth & normal buffer from deferred rendering
  – Half-res or lower with bilateral upsampling
  – Culling should consider extrusion distance

• Separate paths for dynamic & rigid objects
  – Can pre-compute rigid extruded AOV & reduce overdraw

• Doesn’t work with alpha-tested surfaces
  – Simulate with per-surface or per-triangle approx. coverage factor

Page 31: Bending the Graphics Pipeline

Overdarkening (extra occlusion)

Page 32: Bending the Graphics Pipeline

Varying overdraw with AO distance

[Figure: overdraw at AO distances of 0.1 m, 0.2 m, and 0.5 m]

Page 33: Bending the Graphics Pipeline

AOV pros & cons

Pros:
• Very high quality, close to raytracing ground truth
• Noise free (when full res)
• Perfectly stable with view changes
• Supports arbitrary dynamic polygon soups

Cons:
• Requires massive fillrate
• Geometry cost
• Overdarkening, may require content tweaks

Page 34: Bending the Graphics Pipeline

AOV future optimizations

• Reduce the massive overdraw
  – Cull / restrict prisms that only extend out to empty air?
  – Clamp screen-space prism size
    • Not correct, but practical tradeoff. HBAO does this

• More optimal prism geometry
  – GS is limited to triangle strip output
  – Precompute using quads for rigid objects

• Geometry LOD / mix with higher-order geometry representations
  – Also see AO volume texture & analytical capsule techniques [Hill10]

Page 35: Bending the Graphics Pipeline

AOV takeaways

• Major improvement in visual quality compared to SSAO

• Interesting use of geometry & rasterization pipelines
  – Builds on existing HW-, SW- & content pipelines
  – Quite simple brute force drop-in (but not as simple as SSAO)

• Siggraph interactive framerates™ today, but lots of potential:
  – Performance highly dependent on occlusion distance
  – Optimizations / less brute force?
  – Use for high-end / reference / precompute / beauty shots initially

Page 36: Bending the Graphics Pipeline

Conclusions

• New graphics pipeline usages are opened up with improved HW performance
  – Often not efficient to do with pure compute
  – Continue to give us more performance & bandwidth!

• We need to continue to break down some fixed graphics pipeline barriers

Page 37: Bending the Graphics Pipeline

Acknowledgments

• Morgan McGuire
• Louis Bavoil
• David Luebke
• Andrew Lauritzen
• Robert Kihl
• Christina Coffin
• SCEE

Page 38: Bending the Graphics Pipeline

Questions?

email: [email protected]

blog: http://repi.se

twitter: @repi

For more DICE talks:

http://publications.dice.se

Page 39: Bending the Graphics Pipeline

References

• [Andersson09] Johan Andersson, “Parallel Graphics in Frostbite - Current & Future”, Beyond Programmable Shading Course, Siggraph 2009. http://s09.idav.ucdavis.edu/
• [Lalonde09] Paul Lalonde, “Innovating in a Software Graphics Pipeline”, Beyond Programmable Shading Course, Siggraph 2009. http://s09.idav.ucdavis.edu/
• [Reshetov09] Alexander Reshetov, “Morphological Antialiasing”
• [Yang09] Jason C. Yang et al., “A Directionally Adaptive Edge Anti-Aliasing Filter”, High Performance Graphics 2009
• [McGuire10] Morgan McGuire, “Ambient Occlusion Volumes”, High Performance Graphics 2010. http://graphics.cs.williams.edu/papers/AOVHPG10/
• [Biri10] Venceslas Biri et al., “Practical morphological antialiasing on the GPU”, Siggraph 2010
• [Bavoil08] Louis Bavoil & Miguel Sainz, “Image-Space Horizon-Based Ambient Occlusion”, Siggraph 2008. http://developer.nvidia.com/object/siggraph-2008-HBAO.html
• [Hill10] Stephen Hill, “Rendering with Conviction”, Game Developers Conference 2010
• [Kihl10] Robert Kihl, “Destruction Masking in Frostbite 2 using Volume Distance Fields”, Advances in Real-Time Rendering in 3D Graphics and Games, Siggraph 2010. http://publications.dice.se
• [Sugerman09] Jeremy Sugerman et al., “GRAMPS: A Programming Model for Graphics Pipelines”, ACM Transactions on Graphics, January 2009. http://graphics.stanford.edu/papers/gramps-tog/
• [Perthuis10] Cedric Perthuis, “MLAA in God of War 3” (PS3 registered developers only)
