CS-378: Game Technology
Lecture #9: More Mapping
Prof. Okan Arikan
University of Texas, Austin
Thanks to James O’Brien, Steve Chenney, Zoran Popovic, Jessica Hodgins
V2005-08-1.1
Today
Shaders
Nvidia Quadro FX 4500
Shadow Buffer Algorithms
Compute z-buffer from light viewpoint
Put it into the shadow buffer
Render normal view, compare world locations of points in the z-buffer and shadow buffer
Have to transform points into the same coordinate system
Problems:
Resolution is a big issue – both depth and spatial
Only some hardware supports the required computations
But, the Cg Tutorial book gives a fragment shader to do it
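The shadow buffer steps above can be sketched on the CPU. This is a minimal illustration in Python, not the fragment shader from the Cg Tutorial. Everything named here is an assumption made for the example: the orthographic LIGHT_MATRIX, the point-sampled occluders (real hardware would rasterize the scene from the light), the buffer size, and the depth bias.

```python
def transform(matrix, point):
    """Apply a 4x4 row-major matrix to a 3D point, then divide by w."""
    x, y, z = point
    v = (x, y, z, 1.0)
    out = [sum(row[i] * v[i] for i in range(4)) for row in matrix]
    w = out[3]
    return (out[0] / w, out[1] / w, out[2] / w)


def texel_of(lx, ly, size):
    """Map light-space x, y in [-1, 1] to integer shadow-buffer indices."""
    u = min(size - 1, max(0, int((lx * 0.5 + 0.5) * size)))
    v = min(size - 1, max(0, int((ly * 0.5 + 0.5) * size)))
    return u, v


def build_shadow_buffer(occluder_points, light_matrix, size):
    """Steps 1 and 2: keep the nearest depth seen from the light per texel."""
    buffer = [[1.0] * size for _ in range(size)]  # 1.0 means "nothing hit"
    for p in occluder_points:
        lx, ly, depth = transform(light_matrix, p)
        u, v = texel_of(lx, ly, size)
        buffer[v][u] = min(buffer[v][u], depth)
    return buffer


def in_shadow(world_point, light_matrix, buffer, size, bias=0.01):
    """Step 3: move the point into light space and compare depths.

    The bias is a hypothetical fudge factor that hides limited depth
    resolution, one of the problems the slide lists.
    """
    lx, ly, depth = transform(light_matrix, world_point)
    u, v = texel_of(lx, ly, size)
    return depth > buffer[v][u] + bias


# Hypothetical orthographic light at z = 10 looking down the -z axis.
# World x, y in [-2, 2] map to [-1, 1]; depth is 0 at z = 10, 1 at z = -10.
LIGHT_MATRIX = [
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, -0.05, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
```

With a single occluder sample at (0, 0, 5), a ground point directly beneath it is reported as shadowed, while a ground point off to the side is not. A coarser buffer makes more ground points share the occluder's texel, which is the spatial resolution problem in miniature.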
Programmable Hardware
Earliest hardware was fixed-function – no control over processing
Then came configurable hardware – fixed function units, but you had some control over how information flowed
Stencil buffer and tests are an example
Multi-texturing is another example
Most recent hardware is programmable
Just like a CPU is programmable, but not quite
Nvidia GeForce FX, ATI 9700
Modified Pipeline
Replace transform and lighting with vertex shader
Vertex shader must now do transform and lighting
But can also do more
Replace texture stages with fragment (pixel) shader
Previously, texture stages were only per-pixel operations
Fragment shader must do texturing
Vertex Shader Motivation
Old graphics hardware did all the work on the CPU – the “graphics card” was a color buffer and D/A converter
Then came hardware rasterizers
Knew how to draw polygons on the screen, and maybe interpolate
Later came texture access in hardware (we’re in the mid-90s)
Important: Per-vertex transformations and lighting (T&L) were on the CPU
Then hardware T&L came to the commodity market
Had been present in $20,000+ machines for a few years, now cost $300
But the functionality (the types of transforms and lighting equations) was fixed in the hardware
Vertex Shaders
To shift more processing to the hardware, general programmability was required
The tasks that come before transformation vary widely
Putting every possible lighting equation in hardware is impractical
A vertex program runs on a vertex shader
Vertex programs can modify the vertex between submission to the pipeline and primitive assembly
Why bother? Why not leave it all on the CPU?
Vertex Program Properties
Run for every vertex, independently
Access to all per-vertex properties
Some registers - NOT retained from one vertex to the next
Some constant memory
Programmer specifies what’s in that memory
Compute on the available data
Output to fixed registers – the next stage of the pipeline
Figure 2: The inputs and outputs of vertex shaders. Arrows indicate read-only, write-only, or read-write.
IO for Vertex Shaders (Circa 2001)
Vertex Programs
All operations work on vectors
Scalars are stored as vectors with the same value in each coordinate
Instruction set varies:
Numerical operations: add, multiply, reciprocal square root, dot product, …
LIT, which computes the diffuse and specular coefficients of the Phong lighting model in one instruction
Can re-arrange (swizzle) and negate vectors before doing op
Matrices can be automatically mapped into registers
No branches in some hardware, but can be done with other instructions
Set a value to 0/1 based on a comparison, then multiply and add
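The compare, multiply, and add trick can be sketched as follows. The Python models registers as 4-tuples; sge and mad are named after the SGE and MAD instructions of the vertex program instruction set, while select is a hypothetical helper introduced only for this illustration.

```python
def sge(a, b):
    """Component-wise set-on-greater-or-equal: 1.0 where a >= b, else 0.0.

    Python needs a conditional here; on the hardware this is one instruction.
    """
    return tuple(1.0 if x >= y else 0.0 for x, y in zip(a, b))


def mad(a, b, c):
    """Component-wise multiply-add: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


def select(lhs, rhs, if_true, if_false):
    """Branch-free 'lhs >= rhs ? if_true : if_false', per component."""
    flag = sge(lhs, rhs)                                  # 0/1 mask
    diff = tuple(t - f for t, f in zip(if_true, if_false))
    return mad(flag, diff, if_false)                      # flag*(t - f) + f
```

Both candidate results are always computed; the 0/1 mask only decides which one survives. That is why this works on hardware with no branch instruction, and also why it saves no work.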
Vertex Program Example
# blend normal and position: v = α·v1 + (1-α)·v2
MOV R3, v[3] ;
MOV R5, v[2] ;
ADD R8, v[1], -R3 ;
ADD R6, v[0], -R5 ;
MAD R8, v[15].x, R8, R3 ;
MAD R6, v[15].x, R6, R5 ;
# transform normal to eye space
DP3 R9.x, R8, c[12] ;
DP3 R9.y, R8, c[13] ;
DP3 R9.z, R8, c[14] ;
# transform position and output
DP4 o[HPOS].x, R6, c[4] ;
DP4 o[HPOS].y, R6, c[5] ;
DP4 o[HPOS].z, R6, c[6] ;
DP4 o[HPOS].w, R6, c[7] ;
# normalize normal
DP3 R9.w, R9, R9 ;
RSQ R9.w, R9.w ;
MUL R9, R9.w, R9 ;
# apply lighting and output color
DP3 R0.x, R9, c[20] ;
DP3 R0.y, R9, c[22] ;
MOV R0.zw, c[21] ;
LIT R1, R0 ;
DP3 o[COL0], c[21], R1 ;
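To make the data flow of the listing above easier to follow, here is a hedged restatement of its normal and lighting path in Python. The register roles are inferred from the listing, not stated by it: c[12..14] is taken to be the rows of the normal transform, c[20] the light direction, c[22] the half-angle vector, and c[21] the (ambient, diffuse, specular, power) material terms. The lit function follows the usual description of LIT; the position transform is left out.

```python
import math


def dot3(a, b):
    """DP3: three-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def blend3(alpha, a, b):
    """ADD then MAD: alpha * (a - b) + b, i.e. alpha*a + (1 - alpha)*b."""
    return tuple(alpha * (x - y) + y for x, y in zip(a[:3], b[:3]))


def normalize3(v):
    """DP3, RSQ, MUL: scale v by the reciprocal square root of v.v."""
    inv_len = 1.0 / math.sqrt(dot3(v, v))
    return tuple(c * inv_len for c in v)


def lit(n_dot_l, n_dot_h, power):
    """LIT: returns (1, diffuse, specular, 1) lighting coefficients."""
    diffuse = max(n_dot_l, 0.0)
    specular = max(n_dot_h, 0.0) ** power if n_dot_l > 0.0 else 0.0
    return (1.0, diffuse, specular, 1.0)


def shade_vertex(alpha, normal_a, normal_b, normal_rows,
                 light_dir, half_vec, material):
    """Blend two keyframe normals, transform, normalize, and light."""
    n = blend3(alpha, normal_a, normal_b)
    n = tuple(dot3(n, row) for row in normal_rows)   # three DP3s
    n = normalize3(n)
    coeffs = lit(dot3(n, light_dir), dot3(n, half_vec), material[3])
    return dot3(material, coeffs)                    # scalar intensity
```

The final dot product weights the constant 1, the diffuse term, and the specular term by the first three material components, so the output is a single intensity that the hardware would replicate across the color channels.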
Fragment Shader Motivation
The idea of per-fragment shaders has been around for a long time
RenderMan is the best example, but not at all real time
In a traditional pipeline, the only major per-pixel operation is texture mapping
All lighting, etc. is done in the vertex processing, before primitive assembly and rasterization
In fact, a fragment is only screen position, color, and tex-coords
Fragment Shader Generic Structure
Fragment Shaders
Fragment shaders operate on fragments in place of the texturing hardware
After rasterization, before any fragment tests or blending
Input: The fragment, with screen position, depth, color, and a set of texture coordinates
Access to textures and some constant data and registers
Compute RGBA values for the fragment, and depth
Can also kill a fragment
Two types of fragment shaders: register combiners (GeForce4) and fully programmable (GeForce FX, Radeon 9700)
Functionality
At a minimum, we want to be able to do Phong interpolation
How do you get normal vector info?
How do you get the light?
How do you get the specular color?
How do you get the world position?
Is a fragment shader much good without a vertex shader?
Can you simulate a pixel shader in the CPU?
Fragment programs, like vertex programs, are hard to write in assembler
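One possible answer to the questions above, sketched in Python as a CPU simulation of a single fragment. It assumes the vertex stage hands per-vertex normals and world positions to the rasterizer as interpolated attributes, and that the light position, eye position, and colors arrive as constants. All function and parameter names are hypothetical, and the reflection-vector specular term is one of several common variants.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)


def interpolate(weights, a, b, c):
    """Barycentric interpolation, standing in for the rasterizer."""
    w0, w1, w2 = weights
    return tuple(w0 * x + w1 * y + w2 * z for x, y, z in zip(a, b, c))


def phong_fragment(weights, normals, positions, light_pos, eye_pos,
                   diffuse, specular, shininess):
    """Light one fragment from interpolated per-vertex data."""
    # Interpolated normals shrink, so renormalize per fragment.
    n = normalize(interpolate(weights, *normals))
    p = interpolate(weights, *positions)
    l = normalize(tuple(lc - pc for lc, pc in zip(light_pos, p)))
    v = normalize(tuple(ec - pc for ec, pc in zip(eye_pos, p)))
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return (0.0, 0.0, 0.0)            # surface faces away from light
    # Reflect the light direction about the normal.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    spec = max(dot(r, v), 0.0) ** shininess
    return tuple(d * n_dot_l + s * spec for d, s in zip(diffuse, specular))
```

The sketch also suggests answers to the last two questions: without a vertex stage supplying normals and positions, the fragment stage has little to work with, and simulating it on the CPU is straightforward but means running this function once per covered pixel.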
Shading Languages
Programming shading hardware is still a difficult process
Akin to writing assembly language programs
Shading languages and accompanying compilers allow users to write shaders in high level languages
Two examples: Microsoft’s HLSL (part of DirectX 9) and Nvidia’s Cg (compatible with HLSL)
RenderMan is the ultimate example, but it’s not real time
Cg
I’m not going to tell you much about it – pick up the tutorial book and learn about it yourselves
It looks like C or C++
Actually a language and a runtime environment
Can compile ahead of time, or compile on the fly
Why compile on the fly?
What it can do is tightly tied to the hardware
How does it know which hardware, and how to use it?
Vertex Program Example
Pixel Program Example
Cg Runtime
There is a sequence of commands to get your Cg program onto the hardware
See the Cg Tutorial for more details (Appendix B)
Other Things to Try
Many ways of doing bump mapping
Shadow volume construction with vertex shaders
Key observation: degenerate primitives are not rendered
Animation skinning in hardware (deformations)
General purpose computations on matrices, such as fluid dynamics
The Nvidia web site has lots of examples of different effects, as does the Cg tutorial book
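The degenerate-primitive observation can be illustrated with a per-vertex extrusion function. This Python sketch shows one well-known way to build shadow volumes in a vertex program; the slide does not spell out the method, so the approach, the function name, and the extrusion distance are all assumptions. The mesh is assumed to carry a zero-area quad along every edge, with each quad vertex holding the normal of one of the two adjacent faces.

```python
import math


def extrude_vertex(position, normal, light_pos, extrude_distance=100.0):
    """Push a vertex away from the light if its face normal points away.

    Written branch-free as a 0/1 flag times an offset, matching the
    compare, multiply, and add idiom of early vertex hardware.
    """
    to_vertex = tuple(p - l for p, l in zip(position, light_pos))
    facing = sum(n * t for n, t in zip(normal, to_vertex))
    facing_away = 1.0 if facing > 0.0 else 0.0
    length = math.sqrt(sum(t * t for t in to_vertex))
    direction = tuple(t / length for t in to_vertex)
    return tuple(p + facing_away * extrude_distance * d
                 for p, d in zip(position, direction))
```

Where both faces sharing an edge agree (both lit or both unlit), all four quad vertices move together or stay put, so the quad remains degenerate and is not rendered. Along a silhouette edge only half of them move, and the quad stretches into a side wall of the shadow volume.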
At the end of the day !!!
What do we do now?
Guest Lecture
March 21st. Bill Mark
www.opengl.org
Q & A
You MUST have questions