Computer Graphics 3 Lecture 4: GPU Programming


Post on 11-Jan-2016








  • Computer Graphics 3, Lecture 4: GPU Programming

    Dr. Benjamin Mora, University of Wales Swansea

  • Content
      • Introduction.
      • Vertex and Fragment Programs.
      • Programming the GPU.
          • Assembly Code.
          • High Level Languages.
      • Example of applications.
      • Conclusion.

  • Introduction

  • Introduction
      • OpenGL (SGI) shaped the early design of current graphics processors (GPUs).
          • Fixed pipeline.
          • Once the different tests are passed, the fragment color is replaced by the new (textured and interpolated) one.
          • Not realistic enough.
      • The graphics pipeline is fed with primitives such as triangles and points, which are then rasterized.
      • Two main stages:
          • Vertex processing.
          • Fragment (rasterized pixel) processing.
      • These two stages have been extended for more realism.
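
As a rough illustration of the fixed-pipeline behaviour described above, the following Python sketch depth-tests an incoming fragment and, if the test passes, simply replaces the stored color. The `Framebuffer` class, the "less than" depth comparison, and the cleared depth of 1.0 are assumptions chosen for the example; they are not taken from the lecture.

```python
class Framebuffer:
    """Tiny color + depth buffer (hypothetical helper for illustration)."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # Cleared to opaque black and to the farthest depth (assumed 1.0).
        self.color = [[(0.0, 0.0, 0.0, 1.0)] * width for _ in range(height)]
        self.depth = [[1.0] * width for _ in range(height)]

    def write_fragment(self, x, y, z, rgba):
        """Depth-test one fragment; replace the stored color if it passes."""
        if z < self.depth[y][x]:      # assumed depth function: "less"
            self.depth[y][x] = z
            self.color[y][x] = rgba   # fixed pipeline: plain replacement
            return True
        return False


fb = Framebuffer(2, 2)
near_passed = fb.write_fragment(0, 0, 0.5, (1.0, 0.0, 0.0, 1.0))
far_passed = fb.write_fragment(0, 0, 0.8, (0.0, 1.0, 0.0, 1.0))
```

The second fragment lies behind the first, so it fails the test and the red color stays in place. A fragment program replaces the "plain replacement" step with arbitrary per-fragment computation.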

  • Introduction: latest evolutions
      • Unified shaders.
          • Graphics units are balanced automatically between vertex and fragment programs.
      • The smaller the image, the more CPU- and vertex-bound the program is.
      • The larger the image, the more fragment/pixel-bound the program is.
      • Anti-aliasing and texture filtering parameters also contribute to this.
      • Geometry shaders are discussed separately.

  • Vertex and Fragment Programs

  • Vertex and Fragment Programs
    [Figure. Source: Daniel Weiskopf, Basics of GPU-Based Programming]

  • Vertex and Fragment Programs
    [Pipeline figure; surviving labels: Vertices, Tests (z, stencil)]

  • Programming the GPU

  • Programming the GPU
      • Low-level languages (pseudo-assembler).
          • Help to understand what is possible on the GPU.
          • Large programs are painful to maintain and optimize.
          • May be specific to the graphics card generation or vendor.
      • High-level languages.
          • Easier to write.
          • Early compilers were not very good.
          • Code may be more portable.
          • Loops.

  • Current Low Level Languages (APIs)
      • DirectX 9.
          • Vertex shader 2.0.
          • Pixel shader 2.0.
      • OpenGL extensions.
          • GL_ARB_vertex_program.
          • GL_ARB_fragment_program.
      • Vendor APIs.
          • NVidia vertex and fragment programs.

  • Current High Level Languages (APIs)
      • Microsoft, ATI.
      • High Level Shading Language (HLSL).
      • OpenGL Shading Language.

  • How to use them?
      • Assembly programs:
          • Can be loaded (and compiled) at run-time (OpenGL).
          • Several programs can be loaded at once.
              • The suitable rendering style (i.e. program) can then be applied to each scene primitive.
              • This avoids the latency of pseudo-assembly compilation.
      • High-level programs:
          • Must be compiled before run-time.
          • The resulting (pseudo-)assembly code can then be used.

  • Vertex Programs
      • A vertex program:
          • Bypasses the T&L unit.
          • Uses a GPU instruction set to perform all vertex math.
          • Input: arbitrary vertex attributes.
          • Output: transformed vertex attributes.
              • Homogeneous clip space position (required).
              • Colors (front/back, primary/secondary).
              • Fog coordinate.
              • Texture coordinates.
              • Point size.

  • Vertex Programs
      • Customized computation of vertex attributes.
      • Computation of anything that can be interpolated linearly between vertices.
      • Limitations:
          • Vertices can neither be generated nor destroyed.
              • A geometry shader is needed for that.
          • No information about topology or ordering of vertices is available.
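
To make "anything that can be interpolated linearly between vertices" concrete, here is a minimal Python sketch that blends three per-vertex attribute tuples with barycentric weights, as a rasterizer would for a point inside a triangle. The function name `interpolate_attribute` is hypothetical, and the sketch ignores perspective correction, which real hardware applies.

```python
def interpolate_attribute(a0, a1, a2, weights):
    """Blend three per-vertex attribute tuples with barycentric weights.

    `weights` holds one weight per vertex and is expected to sum to 1.
    Perspective correction is deliberately left out of this sketch.
    """
    w0, w1, w2 = weights
    return tuple(w0 * x0 + w1 * x1 + w2 * x2
                 for x0, x1, x2 in zip(a0, a1, a2))


# Per-vertex colors, as a vertex program might write them to its color output.
red = (1.0, 0.0, 0.0, 1.0)
green = (0.0, 1.0, 0.0, 1.0)
blue = (0.0, 0.0, 1.0, 1.0)

at_vertex0 = interpolate_attribute(red, green, blue, (1.0, 0.0, 0.0))
on_edge = interpolate_attribute(red, green, blue, (0.5, 0.5, 0.0))
```

The same blend applies to texture coordinates, fog coordinates, or any other value the vertex program outputs, which is why those outputs are restricted to quantities that vary linearly across a primitive.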

  • Vertex Programs
      • Vertex programs bypass the following OpenGL functionalities:
          • Vertex transformations.
          • The modelview and projection matrix transformations.
          • Normal transformations and normalizations.
          • Color material.
          • Per-vertex lighting.
          • Texture coordinate generation.
          • Texture matrix transformations.
          • Raster position transformation.
          • Client-defined clip planes.
          • Per-vertex processing in EXT_point_parameters.
          • Per-vertex processing in NV_fog_distance.
          • Per-vertex point size computations.

  • Vertex Programs
      • What is not replaced?
          • The view frustum clip.
          • Perspective divide (division by w).
          • The viewport transformation.
          • The depth range transformation.
          • Clamping the primary and secondary color to [0,1].
          • Primitive assembly and per-fragment operations.
          • Evaluators (except the AUTO_NORMAL normalization).
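
Three of the stages that remain fixed (perspective divide, viewport transformation, depth range transformation) can be sketched in a few lines of Python. The function names, the 640x480 viewport, and the default depth range of [0, 1] are assumptions for the example; the formulas follow the usual OpenGL conventions, and frustum clipping is assumed to have happened already.

```python
def perspective_divide(clip):
    """Clip-space (x, y, z, w) -> normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def viewport_transform(ndc, x0, y0, width, height, near=0.0, far=1.0):
    """NDC in [-1, 1] -> window coordinates, including the depth range."""
    xn, yn, zn = ndc
    xw = (xn + 1.0) * width / 2.0 + x0
    yw = (yn + 1.0) * height / 2.0 + y0
    zw = (far - near) / 2.0 * zn + (far + near) / 2.0
    return (xw, yw, zw)


# A clip-space position such as a vertex program would write to its
# position output (values chosen arbitrarily for the example).
clip_pos = (1.0, -1.0, 0.0, 2.0)
ndc = perspective_divide(clip_pos)
window = viewport_transform(ndc, 0, 0, 640, 480)
```

This is why the homogeneous clip space position is the one required output of a vertex program: everything downstream of it is still done by the fixed hardware.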

  • NV Vertex Programs
      • Different versions: 1.0, 1.1, 2.0, 3.0.
      • Version 1.0:
          • 12 temporary vector registers (xyzw): R0 to R11.
          • 96 read-only vector registers (xyzw).
              • Specified outside of glBegin/glEnd.
          • 8 matrices.
          • 17 different vertex program instructions (128 instructions maximum per program).
              • 27 in the shader 3.0 model.

  • NV Vertex Programs: input parameters for the vertices (v[])

    Mnemonic  Number  Typical meaning
    OPOS      0       object position
    WGHT      1       vertex weight
    NRML      2       normal
    COL0      3       primary color
    COL1      4       secondary color
    FOGC      5       fog coordinate
    TEX0      8       texture coordinate 0
    TEX1      9       texture coordinate 1
    TEX2      10      texture coordinate 2
    TEX3      11      texture coordinate 3
    TEX4      12      texture coordinate 4
    TEX5      13      texture coordinate 5
    TEX6      14      texture coordinate 6
    TEX7      15      texture coordinate 7

  • NV Vertex Programs: new output values for the vertices (o[])

    Mnemonic  Typical meaning
    HPOS      Homogeneous clip space position (x,y,z,w)
    COL0      Primary color (front-facing) (r,g,b,a)
    COL1      Secondary color (front-facing) (r,g,b,a)
    BFC0      Back-facing primary color (r,g,b,a)
    BFC1      Back-facing secondary color (r,g,b,a)
    FOGC      Fog coordinate (f,*,*,*)
    PSIZ      Point size (p,*,*,*)
    TEX0      Texture coordinate set 0 (s,t,r,q)
    TEX1      Texture coordinate set 1 (s,t,r,q)
    TEX2      Texture coordinate set 2 (s,t,r,q)
    TEX3      Texture coordinate set 3 (s,t,r,q)
    TEX4      Texture coordinate set 4 (s,t,r,q)
    TEX5      Texture coordinate set 5 (s,t,r,q)
    TEX6      Texture coordinate set 6 (s,t,r,q)
    TEX7      Texture coordinate set 7 (s,t,r,q)

  • NV Vertex Programs: vertex program instructions

    Inputs are scalar (s) or vector (v); outputs are vector (v) or replicated scalar (ssss).

    OpCode  Inputs  Output            Operation
    ARL     s       address register  address register load
    MOV     v       v                 move
    MUL     v,v     v                 multiply
    ADD     v,v     v                 add
    MAD     v,v,v   v                 multiply and add
    RCP     s       ssss              reciprocal
    RSQ     s       ssss              reciprocal square root
    DP3     v,v     ssss              3-component dot product
    DP4     v,v     ssss              4-component dot product
    DST     v,v     v                 distance vector
    MIN     v,v     v                 minimum
    MAX     v,v     v                 maximum
    SLT     v,v     v                 set on less than
    SGE     v,v     v                 set on greater than or equal
    EXP     s       v (ssss?)         exponential base 2
    LOG     s       v (ssss?)         logarithm base 2
    LIT     v       v                 light coefficients
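
The difference between "vector" and "replicated scalar" outputs is easier to see in code. The Python sketch below emulates a handful of the opcodes on 4-tuples. The lowercase function names are hypothetical stand-ins for the instructions, and taking the absolute value inside `rsq` reflects my reading of the NV_vertex_program specification rather than anything stated in the lecture.

```python
import math


def dp3(a, b):
    """DP3: 3-component dot product, replicated to all four components."""
    s = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return (s, s, s, s)


def dp4(a, b):
    """DP4: 4-component dot product, replicated to all four components."""
    s = sum(x * y for x, y in zip(a, b))
    return (s, s, s, s)


def mad(a, b, c):
    """MAD: component-wise a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


def slt(a, b):
    """SLT: 1.0 where a < b, else 0.0 (component-wise)."""
    return tuple(1.0 if x < y else 0.0 for x, y in zip(a, b))


def sge(a, b):
    """SGE: 1.0 where a >= b, else 0.0 (component-wise)."""
    return tuple(1.0 if x >= y else 0.0 for x, y in zip(a, b))


def rsq(s):
    """RSQ: reciprocal square root of |s|, replicated (abs is an assumption)."""
    r = 1.0 / math.sqrt(abs(s))
    return (r, r, r, r)


dot = dp3((1.0, 2.0, 3.0, 4.0), (4.0, 5.0, 6.0, 7.0))
fused = mad((1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0), (1.0, 1.0, 1.0, 1.0))
less = slt((0.0, 1.0, 2.0, 3.0), (1.0, 1.0, 1.0, 1.0))
greater_equal = sge((0.0, 1.0, 2.0, 3.0), (1.0, 1.0, 1.0, 1.0))
inv_root = rsq(4.0)
```

Since version 1.0 has no branching, SLT and SGE are the usual way to express conditions: they produce 0.0 or 1.0 masks that are then multiplied into other values.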

  • NV Vertex Programs
      • Special instruction manipulation: use of negated values:

        MOV R0, -R1;
        ADD R0, R1, -R2;   # R0 = R1 - R2

  • NV Vertex Programs
      • Example: normal normalization.

        # v[NRML] = (nx,ny,nz)
        #
        # R0.xyz = normalize(v[NRML])
        # R0.w   = 1/sqrt(nx*nx + ny*ny + nz*nz)
        #
        !!VP1.0
        MOV R1, v[NRML];
        DP3 R0.w, R1, R1;       # R0.w = squared length of the normal
        RSQ R0.w, R0.w;         # R0.w = 1 / length
        MUL R0.xyz, R1, R0.w;   # R0.xyz = normalized normal
        # Then use R0 to compute shading...
        MOV o[COL0],...
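
The same three steps (dot product, reciprocal square root, multiply) can be checked on the CPU. This is only a sketch: `normalize_like_vp` is a hypothetical name, and the returned 4-tuple mimics a register holding the unit normal in xyz and 1/length in w.

```python
import math


def normalize_like_vp(normal):
    """Normalize a 3-vector with the DP3 / RSQ / MUL sequence."""
    nx, ny, nz = normal
    length_sq = nx * nx + ny * ny + nz * nz   # DP3 step: n . n
    inv_length = 1.0 / math.sqrt(length_sq)   # RSQ step: 1 / sqrt(n . n)
    # MUL step: scale each component; keep 1/length in the w slot.
    return (nx * inv_length, ny * inv_length, nz * inv_length, inv_length)


r0 = normalize_like_vp((3.0, 0.0, 4.0))
unit_length = math.sqrt(r0[0] ** 2 + r0[1] ** 2 + r0[2] ** 2)
```

Using a reciprocal square root followed by a multiply avoids a division and a separate square root, neither of which exists as a single instruction in the version 1.0 instruction set listed in this lecture.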

  • NV Vertex Programs

    # Simple specular and diffuse lighting computation with an eye-space normal
    !!VP1.0
    #
    # c[0-3]  = modelview projection (composite) matrix
    # c[4-7]  = modelview inverse transpose
    # c[32]   = normalized eye-space light direction (infinite light)
    # c[33]   = normalized constant eye-space half-angle vector (infinite viewer)
    # c[35].x = pre-multiplied monochromatic diffuse light color & diffuse material
    # c[35].y = pre-multiplied monochromatic ambient light color & diffuse material
    # c[36]   = specular color
    # c[38].x = specular power
    #
    # outputs homogeneous position and color
    #
    DP4 o[HPOS].x, c[0], v[OPOS];
    DP4 o[HPOS].y, c[1], v[OPOS];
    DP4 o[HPOS].z, c[2], v[OPOS];
    DP4 o[HPOS].w, c[3], v[OPOS];
    DP3 R0.x, c[4], v[NRML];
    DP3 R0.y, c[5], v[NRML];
    DP3 R0.z, c[6], v[NRML];            # R0 = n' = transformed normal
    DP3 R1.x, c[32], R0;                # R1.x = Lpos DOT n'
    DP3 R1.y, c[33], R0;                # R1.y = hHat DOT n'
    MOV R1.w, c[38].x;                  # R1.w = specular power
    LIT R2, R1;                         # Compute lighting values
    MAD R3, c[35].x, R2.y, c[35].y;     # diffuse + ambient
    MAD o[COL0].xyz, c[36], R2.z, R3;   # + specular
    END
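
The lighting part of this program hinges on LIT, so a CPU-side sketch may help. The Python below assumes the normal is already in eye space (it skips the matrix rows in c[4-7]) and models LIT as returning (1, clamped diffuse, specular, 1) according to my reading of the NV_vertex_program specification. The clamp that the specification places on the specular power is omitted, and all function and parameter names are hypothetical.

```python
def dot3(a, b):
    """3-component dot product returning a plain scalar."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def lit(diffuse_dot, specular_dot, power):
    """Sketch of LIT: clamp both dot products, gate specular on diffuse."""
    diffuse = max(diffuse_dot, 0.0)
    specular_base = max(specular_dot, 0.0)
    specular = specular_base ** power if diffuse > 0.0 else 0.0
    return (1.0, diffuse, specular, 1.0)


def shade(normal, light_dir, half_vec, diffuse, ambient, specular_color, power):
    """Mirror the two DP3s, LIT, and two MADs of the listing above."""
    n_dot_l = dot3(light_dir, normal)
    n_dot_h = dot3(half_vec, normal)
    _, diff, spec, _ = lit(n_dot_l, n_dot_h, power)
    base = diffuse * diff + ambient                          # first MAD
    return tuple(c * spec + base for c in specular_color)    # second MAD


facing = shade((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0),
               diffuse=0.5, ambient=0.25,
               specular_color=(0.25, 0.25, 0.25), power=8.0)
away = shade((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0),
             diffuse=0.5, ambient=0.25,
             specular_color=(0.25, 0.25, 0.25), power=8.0)
```

A surface facing away from the light receives only the ambient term, because LIT zeroes both the diffuse and the specular coefficient when the diffuse dot product is not positive.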

  • NV Fragment Programs
      • Similar to vertex programs.
          • Programs are loaded in the same way.
          • Inputs and outputs are different.
          • Different set of instructions.
              • There are more instructions, but many are the same as for vertex programs.
      • Versions available: 1.0, 2.0, and 4.0.
      • 64 constant vector registers.
      • 32 registers with 32-bit floating-point precision, or 64 registers with 16-bit floating-point precision.
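
The "32 fp32 or 64 fp16" trade-off above is a storage-versus-precision choice, which the Python sketch below illustrates with the standard `struct` module. It assumes that each register is a 4-component vector and that fp16 behaves like the IEEE half-precision format that `struct` implements; neither detail is stated in the lecture.

```python
import struct


def to_fp16(value):
    """Round a Python float to 16-bit half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_fp32(value):
    """Round a Python float to 32-bit single precision and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


COMPONENTS = 4      # assumed: xyzw vector registers
FP32_BYTES = 4
FP16_BYTES = 2

# Both register configurations occupy the same amount of storage.
fp32_file_bytes = 32 * COMPONENTS * FP32_BYTES
fp16_file_bytes = 64 * COMPONENTS * FP16_BYTES

# The price of twice as many registers is a larger rounding error.
error_fp16 = abs(to_fp16(0.1) - 0.1)
error_fp32 = abs(to_fp32(0.1) - 0.1)
```

Half precision is usually adequate for colors, but it can be too coarse for texture coordinates on large textures or for positions, which is one reason to keep some values in fp32 registers.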

  • NV Fragment Programs: fragment program inputs

    Register name  Description
    f[WPOS]        Position of the fragment center (x,y,z,1/w)
    f[COL0]        Interpolated primary color (r,g,b,a)
    f[COL1]        Interpolated secondary color (r,g,b,a)
    f[FOGC]        Interpolated fog distance/coord (z,0,0,0)
    f[TEX0]        Texture coordinate (unit 0) (s,t,r,q)
    f[TEX1]        Texture coordinate (unit 1) (s,t,r,q)
    f[TEX2]        Texture coordinate (unit 2) (s,t,r,q)
    f[TEX3]        Texture coordinate (unit 3) (s,t,r,q)
    f[TEX4]        Texture coordinate (unit 4) (s,t,r,q)
    f[TEX5]        Texture coordinate (unit 5) (s,t,r,q)
    f[TEX6]        Texture coordinate (unit 6) (s,t,r,q)
    f[TEX7]        Texture coordinate (unit 7) (s,t,r,q)

  • NV Fragment Programs: fragment program outputs

    Register name  Description
    o[COLR]        Final RGBA fragment color, fp32 format (color programs)
    o[COLH]        Final RGBA fragment color, fp16 format (color programs)
    o[DEPR]        Final fragment depth value, fp32 format
    o[TEX0]        TEXTURE0 output, fp16 format (combiner programs)
    o[TEX1]        TEXTURE1 output, fp16 format (combiner programs)
    o[TEX2]        TEXTURE2 ou

