Introduction to Programmable Hardware
Traditional Graphics Pipeline
transform & lighting (per-vertex operations) → setup, rasterizer (per-primitive operations) → texture blending (per-fragment operations) → frame-buffer anti-aliasing
Programmable features
• Vertex Programming
• Pixel Shader
  – Texture shader
  – Register combiner
• Based on nVIDIA architecture
Vertex Program (cont’d)
Vertex Programming offers programmable T&L unit
[Figure: the same pipeline, with the transform & lighting stage replaced by user-defined vertex processing.]
Gives the programmer total control of vertex processing.
Vertex Program (cont’d)
[Figure: the pipeline with the vertex program in place of the transform & lighting stage.]
Vertex Program (cont’d)
Vertex Program
– Assembly language interface to T&L unit
– GPU instruction set to perform all vertex math
– Reads an untransformed, unlit vertex
– Creates a transformed vertex
– Optionally:
  • Lights a vertex
  • Creates texture coordinates
  • Creates fog coordinates
  • Creates point sizes
Create Vertex Program
Programs (assembly) are defined inline as character strings:
static const GLubyte vpgm[] =
    "!!VP1.0\n"                       /* program header */
    "DP4 o[HPOS].x, c[0], v[0];\n"    /* four dot products produce */
    "DP4 o[HPOS].y, c[1], v[0];\n"    /* the transformed position  */
    "DP4 o[HPOS].z, c[2], v[0];\n"
    "DP4 o[HPOS].w, c[3], v[0];\n"
    "MOV o[COL0], v[3];\n"            /* copy v[3] to the output color */
    "END";
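To make the effect of this program concrete, here is a minimal C sketch that emulates it on the CPU. It assumes (the slide does not say so) that c[0] to c[3] hold the rows of the transform matrix, that v[0] is the untransformed position, and that v[3] is the vertex color. The names Vec4, dp4 and run_vertex_program are illustrative and not part of any API.

```c
/* A quad-float register, the unit of storage in the vertex program model. */
typedef struct { float x, y, z, w; } Vec4;

/* DP4: four-component dot product of two registers. */
static float dp4(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* CPU emulation of the five-instruction program above.
   Assumption: c[0..3] are matrix rows, v[0] is position, v[3] is color. */
static void run_vertex_program(const Vec4 c[4], const Vec4 v[16],
                               Vec4 *hpos, Vec4 *col0) {
    hpos->x = dp4(c[0], v[0]);   /* DP4 o[HPOS].x, c[0], v[0]; */
    hpos->y = dp4(c[1], v[0]);   /* DP4 o[HPOS].y, c[1], v[0]; */
    hpos->z = dp4(c[2], v[0]);   /* DP4 o[HPOS].z, c[2], v[0]; */
    hpos->w = dp4(c[3], v[0]);   /* DP4 o[HPOS].w, c[3], v[0]; */
    *col0 = v[3];                /* MOV o[COL0], v[3];         */
}
```

With c[] set to a translation by (2, 3, 4), the position (1, 1, 1, 1) comes out as (3, 4, 5, 1) and the color passes through unchanged.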
Programming Model
Vertex Source         v[0] … v[15]    16x4 registers
Program Constants     c[0] … c[95]    96x4 registers
Temporary Registers   R0 … R11        12x4 registers
Vertex Output         o[HPOS], o[COL0], o[COL1], o[FOGP], o[PSIZ], o[TEX0] … o[TEX7]    15x4 registers
Vertex Program        128 instructions
All quad floats
Instruction Set: The ops
• 17 instructions total
• MOV, MUL, ADD, MAD, DST
• DP3, DP4
• MIN, MAX, SLT, SGE
• RCP, RSQ, LOG, EXP, LIT
• ARL
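None of the 17 instructions is a branch, so a per-component choice is usually built from the comparison and arithmetic ops. The C sketch below emulates SLT, SGE, MUL and MAD on quad floats and combines them into a select. The helper names and the select idiom are illustrative assumptions and do not come from the slides.

```c
/* A quad-float register. */
typedef struct { float x, y, z, w; } Vec4;

/* SLT: per component, 1.0 if a < b, else 0.0. */
static Vec4 slt(Vec4 a, Vec4 b) {
    Vec4 o = { a.x < b.x ? 1.0f : 0.0f, a.y < b.y ? 1.0f : 0.0f,
               a.z < b.z ? 1.0f : 0.0f, a.w < b.w ? 1.0f : 0.0f };
    return o;
}

/* SGE: per component, 1.0 if a >= b, else 0.0. */
static Vec4 sge(Vec4 a, Vec4 b) {
    Vec4 o = { a.x >= b.x ? 1.0f : 0.0f, a.y >= b.y ? 1.0f : 0.0f,
               a.z >= b.z ? 1.0f : 0.0f, a.w >= b.w ? 1.0f : 0.0f };
    return o;
}

/* MUL: per-component product. */
static Vec4 mul(Vec4 a, Vec4 b) {
    Vec4 o = { a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w };
    return o;
}

/* MAD: per-component multiply-add, a * b + c. */
static Vec4 mad(Vec4 a, Vec4 b, Vec4 c) {
    Vec4 o = { a.x * b.x + c.x, a.y * b.y + c.y,
               a.z * b.z + c.z, a.w * b.w + c.w };
    return o;
}

/* Branch-free select: per component, (a < b) ? x : y. */
static Vec4 select_lt(Vec4 a, Vec4 b, Vec4 x, Vec4 y) {
    Vec4 tmp = mul(sge(a, b), y);     /* y where a >= b, else 0 */
    return mad(slt(a, b), x, tmp);    /* add x where a < b      */
}
```

The two masks are complementary, so exactly one of x or y contributes to each component.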
Pixel Shader
[Figure: the pipeline with the texture blending stage replaced by user-defined per-pixel shading.]
Texture Mapping/Blending
Traditional OpenGL texture mapping/blending
[Figure: the texture coordinate feeds the texture unit; vertex colors are Gouraud-shaded into the fragment color; the texture unit blends the two colors to give the fragment color output.]
Multitexturing
An optional extension of OpenGL 1.2
[Figure: the fragment color input passes through a chain of texture units, each blending colors, to give the fragment color output.]
Texture Compositing
[Figure (OpenGL 1.2): Tex0 and Tex1 go through texture fetching; the fragment color passes through texture environment 0 and texture environment 1, then specular color sum (with the specular color) and fog application (with the fog color/factor).]
Choice of 5 functions for RGB and alpha:
  Ct: texture color; At: texture alpha
  Cf: incoming fragment color; Af: incoming fragment alpha
  Cc: color assigned to GL_TEXTURE_ENV_COLOR
Compositing operators:
  Function    RGB                     Alpha
  Replace     Ct                      At
  Modulate    Cf Ct                   Af At
  Decal       Cf (1 – At) + Ct At     Af
  Blend       Cf (1 – Ct) + Cc Ct     Af At
  Add         Cf + Ct                 Af At
Post-environment specular color addition and fog application
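The five compositing functions can be written out directly. The C sketch below follows the table for a texture that supplies both color and alpha. The type and function names are illustrative, and the clamp on Add is an assumption reflecting that fixed-function color results are kept in [0, 1].

```c
/* RGBA color with components nominally in [0, 1]. */
typedef struct { float r, g, b, a; } Color;

typedef enum { ENV_REPLACE, ENV_MODULATE, ENV_DECAL, ENV_BLEND, ENV_ADD } EnvFunc;

static float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* cf: incoming fragment color, ct: texture color, cc: environment color. */
static Color tex_env(EnvFunc f, Color cf, Color ct, Color cc) {
    Color o = cf;
    switch (f) {
    case ENV_REPLACE:                     /* Ct, At */
        o = ct;
        break;
    case ENV_MODULATE:                    /* Cf Ct, Af At */
        o.r = cf.r * ct.r;  o.g = cf.g * ct.g;  o.b = cf.b * ct.b;
        o.a = cf.a * ct.a;
        break;
    case ENV_DECAL:                       /* Cf (1 - At) + Ct At, Af */
        o.r = cf.r * (1.0f - ct.a) + ct.r * ct.a;
        o.g = cf.g * (1.0f - ct.a) + ct.g * ct.a;
        o.b = cf.b * (1.0f - ct.a) + ct.b * ct.a;
        o.a = cf.a;
        break;
    case ENV_BLEND:                       /* Cf (1 - Ct) + Cc Ct, Af At */
        o.r = cf.r * (1.0f - ct.r) + cc.r * ct.r;
        o.g = cf.g * (1.0f - ct.g) + cc.g * ct.g;
        o.b = cf.b * (1.0f - ct.b) + cc.b * ct.b;
        o.a = cf.a * ct.a;
        break;
    case ENV_ADD:                         /* Cf + Ct (clamped), Af At */
        o.r = clamp01(cf.r + ct.r);
        o.g = clamp01(cf.g + ct.g);
        o.b = clamp01(cf.b + ct.b);
        o.a = cf.a * ct.a;
        break;
    }
    return o;
}
```

Chaining two calls, with the output of the first used as cf of the second, mimics texture environment 0 feeding texture environment 1.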
Pixel Shader (cont’d)
Texture shader
– 4 texture units
– 23 different texture shader operations
  • Conventional (1D, 2D, 3D, texture rectangle, cube map)
  • Special case (none, pass through, cull fragment)
  • Dependent texture fetches (result of one texture lookup affects texture coords for subsequent unit)
  • Dependent texture fetches with dot product (and optional reflection) calculations
Register combiners
– 8 stages (general combiners) on GeForce3/4
– Per-stage constants
Based on nVIDIA’s GF3/4 architecture
Pixel Shader
Based on nVIDIA’s GF3/4 architecture: texture shader + register combiner
[Figure: the texture shader consists of texture units 0 to 3, each running a texture program; their results and the fragment color input feed the register combiner, which produces the fragment color output.]
Texture Shader
Texture program example: conventional 2D texture
Tex #: i
Texture coords (S,T,R,Q): (Si, Ti, Ri, Qi)
Shader operations: (Si/Qi, Ti/Qi)
Texture fetch: Texture 2D
Bound texture target/format: 2D, any format
Output color: (R,G,B,A)
Texture Shader (cont’d)
Texture program example: pass through
Texture coords: (Si, Ti, Ri, Qi)
Shader operations:
  R = Clamp0to1(Si)
  G = Clamp0to1(Ti)
  B = Clamp0to1(Ri)
  A = Clamp0to1(Qi)
Texture fetch: none
Bound texture target/format: none
Output color: (R,G,B,A)
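As a minimal C sketch of the pass-through operation, the function below turns the four texture coordinates into a color by clamping each one to [0, 1], with no texture fetch involved. The names Color and pass_through are illustrative only.

```c
/* Output color of a texture shader stage. */
typedef struct { float r, g, b, a; } Color;

/* Clamp0to1 as named on the slide. */
static float clamp0to1(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Pass through: the texture coordinates (s, t, r, q) become (R, G, B, A). */
static Color pass_through(float s, float t, float r, float q) {
    Color out = { clamp0to1(s), clamp0to1(t), clamp0to1(r), clamp0to1(q) };
    return out;
}
```

For instance, the coordinates (-0.5, 0.25, 2, 1) yield the color (0, 0.25, 1, 1).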
Texture Shader (cont’d)
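Texture program example: dependent texture
Tex # 0
  Texture coords (S,T,R,Q): app specific
  Shader operations: texture specific
  Texture fetch: any type
  Bound texture target/format: unsigned RGB[A]
  Output color: (R0, G0, B0, A0)
Tex # 1
  Texture coords (S,T,R,Q): ignored
  Shader operations: none
  Texture fetch: 2D, at (A0, R0)
  Bound texture target/format: texture specific
  Output color: RGBA (R1, G1, B1, A1)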
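The dependent fetch can be emulated on the CPU: the alpha and red results of unit 0 become the (s, t) coordinates used to look up the 2D texture bound to unit 1. The C sketch below assumes nearest-neighbor sampling and clamped coordinates purely for brevity; real hardware filters according to the texture's own settings. All names are illustrative.

```c
typedef struct { float r, g, b, a; } Color;

/* A 2D RGBA texture stored row by row. */
typedef struct {
    int width, height;
    const Color *texels;
} Texture2D;

static float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Nearest-neighbor lookup with coordinates clamped to [0, 1] (assumption). */
static Color sample_nearest(const Texture2D *tex, float s, float t) {
    int x = (int)(clamp01(s) * (float)tex->width);
    int y = (int)(clamp01(t) * (float)tex->height);
    if (x >= tex->width)  x = tex->width - 1;
    if (y >= tex->height) y = tex->height - 1;
    return tex->texels[y * tex->width + x];
}

/* Unit 1: ignore its own texture coords and fetch at (A0, R0),
   where stage0 is the color produced by unit 0. */
static Color dependent_ar_fetch(const Texture2D *tex1, Color stage0) {
    return sample_nearest(tex1, stage0.a, stage0.r);
}
```

Because the coordinates come from a previous lookup, unit 0's texture effectively acts as an index into unit 1's texture.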
Register Combiner
GeForce 2 (only 2 general combiner stages)
[Figure: the register set is filled from the fragment color, texture 0 and texture 1 (from texture fetching), the fog color/factor, the specular color and Spare 0. General combiner 0 and general combiner 1 each take 4 RGB inputs and 4 alpha inputs and produce 3 RGB outputs and 3 alpha outputs. The final combiner takes 6 RGB inputs and 1 alpha input.]
Register-based programming
– All textures and colors available for each and every texture blending stage
– 8 stages of blending in hardware, plus specular and fog
  • Note that GeForce3 has 8 combiners, and 4 textures.
– Signed color arithmetic
Register Combiner (cont’d)
Diagram of a General Combiner
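[Figure: a general combiner has an RGB portion and an alpha portion.
RGB portion: inputs A, B, C, D are read from the RGB or alpha registers through input mappings; the RGB function computes A op1 B, C op2 D and AB op3 CD; the results pass through RGB scale/bias into the next combiner's RGB registers.
Alpha portion: inputs A, B, C, D are read from the alpha or blue registers through input mappings; the alpha function computes AB, CD and AB op4 CD; the results pass through alpha scale/bias into the next combiner's alpha registers.]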
General Combiner Input Registers
The Register Set
Primary (diffuse) color
  • initialized to RGBA of fragment’s primary color
Secondary (specular) color
  • initialized to RGB of fragment’s secondary/specular color
  • alpha not initialized
Texture 0 and Texture 1 colors
  • initialized to fragment’s filtered RGBA texel from numbered texture unit
  • not initialized if numbered texture unit is disabled or non-existent
Spare 0 and Spare 1
  • Alpha of Spare 0 is initialized to alpha of Texture 0 color (if enabled)
  • RGB of Spare 0 and all of Spare 1 is not initialized
Fog
  • RGB is current fog color
  • alpha is fragment’s fog factor (only available in final combiner)
  • read-only
Constant color 0 and Constant color 1
  • initialized to user-defined RGBA value
  • read-only
Zero
  • constant, read-only value of zero
General Combiner Input Mappings
  Mapping             f(x)                          Input      Output
  Signed Identity     f(x) = x                      [-1, 1]    [-1, 1]
  Unsigned Identity   f(x) = max(0, x)              [0, 1]     [0, 1]
  Expand Normal       f(x) = 2 * max(0, x) - 1      [0, 1]     [-1, 1]
  Half Bias Normal    f(x) = max(0, x) - ½          [0, 1]     [-½, ½]
  Signed Negate       f(x) = -x                     [-1, 1]    [1, -1]
  Unsigned Invert     f(x) = 1 - min(max(0, x), 1)  [0, 1]     [1, 0]
  Expand Negate       f(x) = -2 * max(0, x) + 1     [0, 1]     [1, -1]
  Half Bias Negate    f(x) = -max(0, x) + ½         [0, 1]     [½, -½]
[Figure: one plot per mapping, with both axes running from -1 to 1.]
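The eight input mappings are simple scalar functions, so they are easy to check on the CPU. The C sketch below implements the formulas listed for each mapping; the enum and function names are illustrative and are not the OpenGL tokens.

```c
typedef enum {
    MAP_SIGNED_IDENTITY, MAP_UNSIGNED_IDENTITY,
    MAP_EXPAND_NORMAL,   MAP_HALF_BIAS_NORMAL,
    MAP_SIGNED_NEGATE,   MAP_UNSIGNED_INVERT,
    MAP_EXPAND_NEGATE,   MAP_HALF_BIAS_NEGATE
} InputMapping;

static float max0(float x) { return x > 0.0f ? x : 0.0f; }
static float min1(float x) { return x < 1.0f ? x : 1.0f; }

/* Apply one input mapping to a single register component. */
static float map_input(InputMapping m, float x) {
    switch (m) {
    case MAP_SIGNED_IDENTITY:   return x;
    case MAP_UNSIGNED_IDENTITY: return max0(x);
    case MAP_EXPAND_NORMAL:     return 2.0f * max0(x) - 1.0f;
    case MAP_HALF_BIAS_NORMAL:  return max0(x) - 0.5f;
    case MAP_SIGNED_NEGATE:     return -x;
    case MAP_UNSIGNED_INVERT:   return 1.0f - min1(max0(x));
    case MAP_EXPAND_NEGATE:     return -2.0f * max0(x) + 1.0f;
    case MAP_HALF_BIAS_NEGATE:  return -max0(x) + 0.5f;
    }
    return x;  /* unreachable for valid enum values */
}
```

Expand Normal is the mapping that turns a color stored in [0, 1] back into a signed vector component in [-1, 1]; for example, 0.75 maps to 0.5.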
General Combiner RGB Functions
The RGB function takes inputs A, B, C, D in one of five configurations:
  Mult / Mult / Sum       outputs AB, CD, AB + CD
  Mult / Mult / Mux       outputs AB, CD, mux(AB, CD)
  Dot / Dot / Discard     outputs A • B, C • D
  Dot / Mult / Discard    outputs A • B, CD
  Mult / Dot / Discard    outputs AB, C • D
mux(AB, CD) selects AB or CD by comparing Spare0[Alpha] with ½.
Dot products on RGB registers: A • B = (d, d, d), where d = A[red] * B[red] + A[green] * B[green] + A[blue] * B[blue]
Multiplication on RGB registers: AB = (A[red] * B[red], A[green] * B[green], A[blue] * B[blue])
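The RGB operations defined above translate directly into C. In this sketch the mux takes the outcome of the Spare0 alpha test as a flag supplied by the caller instead of performing the comparison itself; scale/bias and range clamping are left out. All names are illustrative.

```c
/* An RGB register value after input mapping. */
typedef struct { float r, g, b; } RGB;

/* Multiplication on RGB registers: component-wise product. */
static RGB rgb_mult(RGB a, RGB b) {
    RGB o = { a.r * b.r, a.g * b.g, a.b * b.b };
    return o;
}

/* Dot product on RGB registers: the scalar result is replicated
   into all three components. */
static RGB rgb_dot(RGB a, RGB b) {
    float d = a.r * b.r + a.g * b.g + a.b * b.b;
    RGB o = { d, d, d };
    return o;
}

/* Third output of Mult / Mult / Sum: AB + CD. */
static RGB rgb_sum(RGB ab, RGB cd) {
    RGB o = { ab.r + cd.r, ab.g + cd.g, ab.b + cd.b };
    return o;
}

/* Third output of Mult / Mult / Mux. pick_ab is the result of the
   Spare0 alpha test against 1/2, computed by the caller. */
static RGB rgb_mux(RGB ab, RGB cd, int pick_ab) {
    return pick_ab ? ab : cd;
}
```

A typical use of the dot product is per-pixel lighting: with A holding a normal and B a light vector, both expanded to [-1, 1], rgb_dot(A, B) gives the diffuse term in every channel.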
Diagram of the Final Combiner (OpenGL only)
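[Figure: the final combiner has an RGB portion and an alpha portion.
RGB portion: inputs A, B, C, D are read from the RGB or alpha registers through input mappings; the RGB function computes AB + (1-A)C + D, which gives RGB Out.
Inputs E and F feed a multiplier; the product EF is one of the available RGB inputs.
Color sum unit: the sum of Spare0 and the secondary color, clamped to [0, 1], is also one of the available RGB inputs.
Alpha portion: one input, read from the alpha or blue registers through an input mapping, gives Alpha Out.]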
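The final combiner's RGB path can be sketched in C as three small functions: the E times F multiplier, the color sum unit, and the AB + (1-A)C + D function itself. The names are illustrative, and only the color sum is clamped here, as in the diagram.

```c
typedef struct { float r, g, b; } RGB;

static float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Multiplier: the product EF, usable as an RGB input. */
static RGB ef_product(RGB e, RGB f) {
    RGB o = { e.r * f.r, e.g * f.g, e.b * f.b };
    return o;
}

/* Color sum unit: Spare0 + secondary color, clamped to [0, 1]. */
static RGB color_sum(RGB spare0, RGB secondary) {
    RGB o = { clamp01(spare0.r + secondary.r),
              clamp01(spare0.g + secondary.g),
              clamp01(spare0.b + secondary.b) };
    return o;
}

/* RGB function of the final combiner: AB + (1 - A)C + D. */
static RGB final_combiner_rgb(RGB a, RGB b, RGB c, RGB d) {
    RGB o = { a.r * b.r + (1.0f - a.r) * c.r + d.r,
              a.g * b.g + (1.0f - a.g) * c.g + d.g,
              a.b * b.b + (1.0f - a.b) * c.b + d.b };
    return o;
}
```

One possible assignment, chosen here only as an illustration: with A set to the fog factor, B to the shaded color, C to the fog color and D to zero, the function becomes a linear blend between the shaded color and the fog color.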