REAL-TIME SMOOTH SURFACE CONSTRUCTION ON THE GRAPHICS PROCESSING UNIT
By
TIANYUN NI
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2008
© 2008 Tianyun Ni
To my family, especially my father, and to all who have lent encouragement and support
during the time spent on this research
ACKNOWLEDGMENTS
I wish to express my sincerest thanks to the chair of my dissertation committee, Dr. Jorg
Peters, for working with me throughout this long enterprise.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER
1 INTRODUCTION
1.1 Motivation
1.2 Problem Statement
1.3 Modern GPU Pipeline and Current Trends
1.4 Representations in Surface Modeling
1.4.1 Subdivision Surfaces
1.4.2 Parametric Patches
1.4.2.1 Bezier technique
1.4.2.2 Related work
2 A NEW SCHEME FOR SURFACE CONSTRUCTION
2.1 Contribution
2.2 The Conversion Algorithm
2.2.1 The Conversion Rules for a Type-1 Quad
2.2.2 The Conversion Rules for a Type-2, or Type-3 Quad
2.3 Derivation of the coefficients of a c-patch
2.3.1 Derivation of λ0 and λ1
2.3.2 Derivation of b211 and b121
2.3.3 Derivation of b112
2.4 Smoothness Verification
2.5 Complexity Analysis
2.5.1 Number of Patches
2.5.2 Cost of Patch Construction
2.5.3 Cost of Surface Evaluation
2.6 Approximation Catmull-Clark Subdivision Surface
2.7 Water-Tight Surface Verification
2.8 Discussion
3 GPU IMPLEMENTATION
3.1 Overview
3.2 2-pass Approach
3.3 1-pass Approach
3.4 Coordinate System Transformation
3.5 Water-Tight Evaluation
3.6 Conclusion
4 RESULTS
4.1 Shape Quality
4.2 Performance
4.3 Displacement Mapping
4.4 Morphing and Animation
4.5 Conclusion
5 PATCH CONVERSIONS FOR MESHES WITH TRI/QUAD/PENT FACETS
6 DISCUSSION AND FUTURE WORK
6.1 Future GPU API
6.2 Volume Preservation
6.3 Adaptive Tessellation
REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
Table
4-1 ALU operations for evaluation at (u, v)
4-2 Performance results
4-3 Performance of the 1-pass implementation
LIST OF FIGURES
Figure
1-1 Polygonal modeling
1-2 Problem statement
1-3 DirectX 10 pipeline stages
1-4 DirectX 10 pipeline
1-5 The primitives
1-6 The notations of input mesh
1-7 The three possible configurations
1-8 The Catmull-Clark stencils
1-9 The subdivision schemes
1-10 The suggested rendering passes
1-11 Future GPU architecture
1-12 The subdivision schemes
2-1 Derivation of c-patch
2-2 Vertex computation
2-3 Surface conversion
2-4 Computing control points v, e, f and t, the projection of e
2-5 Patch-based computation
2-6 Patch computation
2-7 The re-parameterization of λ to meet G1 at the vertex
2-8 Coefficients b211 and b121 of c-patch are derived on top of a ghost patch
2-9 The choice of middle point in c-patch
2-10 The center of a bi-cubic patch can be evaluated by the linear combination of the boundary coefficients
2-11 C1 transition between a triangular and a bicubic patch
2-12 G1 transition between two triangular patches
3-1 2-Pass implementation
3-2 2-Pass conversion
3-3 1-Pass conversion
3-4 1-Pass implementation
3-5 (u, v) on an irregular quad
3-6 Water-tight Evaluation
4-1 Shape quality comparison
4-2 Catmull-Clark approximation comparison
4-3 Ordinary patches and extraordinary patches
4-4 GPU smoothed quad surfaces with displacement mapping
4-5 Close-up of the frog. The refined mesh is water-tight
4-6 Displacement mapping on the frog model
4-7 Shape comparison
4-8 Shape comparison
4-9 Real time animation on the Sword model
4-10 Real time animation on the Frog model
4-11 Asynchronous animation of nine Frogs
5-1 The reasons for using Tri/Quad/Pent Meshes
5-2 A quad/tri/pent model
5-3 Patch representations
5-4 Triangular representation
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
REAL-TIME SMOOTH SURFACE CONSTRUCTION ON THE GRAPHICS PROCESSING UNIT
By
Tianyun Ni
August 2008
Chair: Jorg Peters
Major: Computer Engineering
Increased realism in interactive graphics and gaming requires complex smooth surfaces
to be rendered at ever higher frame rates. In particular, representations used to model surfaces
offline, such as spline and subdivision surfaces, have to be modified or reorganized to allow
for efficient usage of the graphics processing unit and its SIMD (Single Instruction, Multiple
Data) parallelism. This dissertation presents a novel algorithm for converting quad meshes on
the GPU to smooth, water-tight surfaces at the highest speed documented so far. The conversion
reproduces bi-cubic splines wherever possible and closely mimics the shape of the Catmull-Clark
subdivision surface by c-patches where a vertex has a valence different from 4. The smooth
surface is piecewise polynomial and has well-defined normals everywhere.
CHAPTER 1
INTRODUCTION
This chapter introduces the challenges that motivate the dissertation, gives a detailed
literature review, positions the research relative to the current state of the art, and gives an
overview of the modern GPU pipeline.
1.1 Motivation
In graphics, 3D objects are approximated by polyhedral meshes of great complexity. For
example, a game character can consist of tens of thousands of polygons (Figure 1-1). Increased
realism in interactive gaming demands that such meshes be animated and rendered in real time.
There are essentially two major approaches in the literature that serve this purpose: Polygonal
Modeling and Higher-order Surface Modeling.
There are two scenarios of animation: Morphing and Skinning. Morphing is used to
change one image into another through a seamless transition. Skinning is a common technique
to deform characters [20, 23, 24, 32]. The animated mesh, referred to as a "skin", is deformed
based on the pose of an underlying skeleton. In Polygonal Modeling (Figure 1-1), skinning
and morphing are applied to a high-detail mesh created by an artist. Most games currently use
this approach. This technique involves redundant work due to minimal sharing in the Polygonal
Modeling representation. In addition, a large number of vertices in a complex mesh must be fed
into the graphics pipeline via the GPU’s memory bus, which is a potential bottleneck.
Figure 1-1. Polygonal Modeling: currently the popular animation approach in games.
The alternative approach, Surface Modeling, animates a coarse mesh (Figure 1-2).
Subdivision surfaces and parametric patches, two popular higher-order surface representations,
both support level-of-detail rendering (see Section 1.4). Highly detailed 3D models are produced
by displacement mapping [11]. Displacement mapping adds fine details in the form of scalar
fields on the smooth surface defined by the coarse mesh. As a specific instance, Lee [27]
proposes the Displaced Subdivision Surface, which represents a detailed surface model as a
scalar-valued displacement over a smooth surface domain. This approach reduces the number of
vertices that must be read and animated in each frame because complex geometric details are
generated on the GPU. The runtime cost now includes the conversion process from the coarse
input mesh to the final complex mesh. The conversion process involves surface construction,
evaluation and displacement mapping.
Figure 1-2. Each high-detail mesh in Surface Modeling is represented by a coarse control mesh with a displacement map. The coarse control mesh is first converted to a smooth surface. Then the surface is tessellated and the vertices are perturbed in the normal directions based on the corresponding value in the displacement map. Last, the normal at each vertex of the refined highly-detailed mesh is updated.
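The displacement step described above can be sketched as follows. This is a minimal illustration, not the dissertation's shader code: the function names `normalize` and `displace` and the sample values are assumptions chosen only for the example. It shows why a single scalar per sample suffices once the smooth surface supplies a position and a normal.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def displace(position, normal, height):
    """Move a surface point along its unit normal by a scalar height."""
    n = normalize(normal)
    return tuple(p + height * c for p, c in zip(position, n))

# Hypothetical sample: a point on the plane z = 0 with an upward normal.
displaced = displace((1.0, 2.0, 0.0), (0.0, 0.0, 2.0), 0.25)
print(displaced)  # (1.0, 2.0, 0.25)
```

The normals of the displaced mesh would still have to be recomputed afterwards, as the caption of Figure 1-2 notes; that step is omitted here.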
In summary, the advantages of Surface Modeling are
1. lower computation cost of animation because skinning is done on the coarse mesh, not the final dense mesh;
2. memory and bandwidth savings by encoding most detail as one-dimensional displacements rather than three-dimensional vectors;
3. support of refinement level on the fly;
4. customization of archetypes: we can model different 3D models with the same coarse mesh, changing only the displacement map;
5. support of adaptive tessellation: evaluation does not have to be on a uniform grid.
The disadvantage of Surface Modeling is that modern GPUs cannot render such surfaces directly.
The surface must be converted into triangles or quads through a process of tessellation and
evaluation. Therefore, Surface Modeling becomes attractive as a real-time technique only if
the conversion is cheaper than reading and animating a high-polygon mesh. Our
goal is to design such a scheme on the GPU.
1.2 Problem Statement
Meshes consisting purely of quadrilateral facets are common in modeling for animation. Any
polyhedral mesh can be converted into such a quad mesh by one step of mesh refinement. But a
good designer creates meshes with the quad-restriction in mind so that no global refinement is
necessary. We therefore focus on quadrilateral meshes and aim to derive a set of efficient rules
that run directly on the GPU (Figure 1-2, the red dotted rectangle) and produce surfaces with good
visual quality. Specifically, the resulting surfaces should
1. consist of a small number of low-degree polynomial pieces;
2. possess smooth geometry (no extra cost for smooth shading);
3. closely approximate Catmull-Clark surfaces (a standard modeling tool);
4. be water-tight (no pixel dropout);
5. map well to the graphics pipeline and leverage the strengths of GPU computation.
1.3 Modern GPU Pipeline and Current Trends
A graphics processing unit (GPU) is a dedicated graphics rendering device. Its SIMD
architecture has evolved substantially over the last decade. This highly parallel structure makes
it more effective than general-purpose CPUs for a range of algorithms. Modern GPUs expose a
programmable parallel stream processing pipeline as a series of short programs called shaders.
During the last five years, major graphics software libraries such as OpenGL and DirectX have
been used to program the GPU via shaders on a programmable pipeline, which has mostly superseded
the older "fixed-function pipeline". The two most popular graphics software libraries, DirectX
and OpenGL, currently both specify APIs for three types of shaders: vertex, geometry, and pixel
shaders. The shaders in the DirectX 10 system [4] (Figure 1-4) share a common core that accesses
up to 128 memory buffers and 16 parameter (constant) buffers. Vertex and pixel shaders use a
"one-in, one-out" data processing model. In contrast, the geometry shader has a limited ability to
amplify or reduce the primitive count and thus is able to change meshes. Figure 1-3 shows the
input and output of each pipeline stage.
Figure 1-3. The input and output of each pipeline stage in the DirectX 10 system.
Each stage is explained in more detail as follows:
Figure 1-4. DirectX 10 Pipeline
1. The Input Assembler (IA) gathers vertex data to set up vertex and index buffers. Vertex buffers contain per-vertex data while index buffers define geometry primitives as integer indices into vertex buffers. Indexing helps avoid redundant computations of the same vertex.
2. The vertex shader (VS) typically processes vertex-based operations such as changing the position and normal of a single vertex. The computations in this stage are local. Each vertex only has its own information and does not communicate with other vertices. The VS is most commonly used to transform vertices from object space to clip space.
3. The geometry shader (GS) processes the vertices of a single primitive. A primitive can be a point, a line segment, a triangle, a point with adjacency, a line segment with adjacency, or a triangle with adjacency (Figure 1-5). Because all vertices of the primitive are available (up to 6 vertices for a triangle with adjacency), the computations in this stage are less local than those on the VS and PS. The GS can emit additional primitives. This new amplification feature, introduced in DirectX 10, adds flexibility and makes it possible to implement a number of algorithms [1] on the GPU, such as mesh refinement, shadow volumes, and dynamic particle systems. The geometry shader output may be fed to the rasterizer stage and/or to a vertex buffer in memory via the stream output stage.
4. The rasterizer (TR) is a fixed-function stage that generates fragments by filling in the polygons sent through the graphics pipeline. Clipping, culling, perspective divide, viewport transform, primitive set-up, scissoring, and depth offset also happen in this stage.
5. The pixel shader (PS) operates on one fragment at a time. Usually scene lighting and pixel-related effects such as bump mapping and color tone mapping occur in the PS.
6. The output merger (OM) takes a fragment from the PS and performs traditional stencil and depth testing operations as well as render target blending to generate a final pixel on the screen.
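The role of indexing in the Input Assembler stage can be illustrated with a small CPU-side sketch. It is an idealized model, not DirectX code: `transform_indexed`, `scale_by_two` and the two-triangle mesh are hypothetical names and data, and real hardware reuses transformed vertices through a cache rather than a full array. The point is that a vertex shared by several primitives is processed once.

```python
def transform_indexed(vertices, indices, vertex_fn):
    """Run vertex_fn once per unique vertex, then assemble primitives by index."""
    transformed = [vertex_fn(v) for v in vertices]      # one call per vertex
    return [tuple(transformed[i] for i in prim) for prim in indices]

# Two triangles sharing an edge: 4 vertices instead of 6.
vertex_buffer = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
index_buffer = [(0, 1, 2), (0, 2, 3)]

calls = []
def scale_by_two(v):
    calls.append(v)                 # record how often the "shader" runs
    return (2.0 * v[0], 2.0 * v[1])

triangles = transform_indexed(vertex_buffer, index_buffer, scale_by_two)
print(len(calls))      # 4
print(triangles[1])    # ((0.0, 0.0), (2.0, 2.0), (0.0, 2.0))
```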
Figure 1-5. The six primitives used in the GS
The future GPU pipeline [29, 48] is expected to provide a Tessellation Unit, combined with
new shader stages for patch conversion and evaluation of tessellated higher-order surfaces. The
Tessellator provides a solution to adaptive refinement on the graphics hardware. Based on
user-provided tessellation factors per edge, the tessellator adaptively creates a sampling pattern
of the underlying parametric domain and automatically generates a set of parametric domains. In
addition, two special shaders are introduced in the next-generation GPU pipeline. The patch
shader converts an input mesh to a set of patches. The evaluation shader takes the (u, v) output of
the tessellator and evaluates the patch at (u, v). This future GPU architecture also allows the GPU
to exploit more parallelism because multiple arithmetic units can run the same evaluation
shader. Moreover, tessellation occurs on the GPU and overcomes the bottleneck of bus bandwidth
caused by model complexity. The new GPU design indicates that Surface Modeling is the trend for
real-time graphics.
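The division of labor between the tessellator and the evaluation shader can be sketched on the CPU. The sketch below makes simplifying assumptions that are not part of the text: it uses one uniform tessellation factor instead of per-edge factors, and `uniform_domain`, `evaluate_all` and the bilinear test patch are hypothetical. It is meant only to show that every (u, v) sample is evaluated independently, which is what allows many arithmetic units to run the same evaluation code.

```python
def uniform_domain(factor):
    """Sample [0,1]x[0,1] on a (factor+1)^2 grid and split each cell into 2 triangles."""
    n = factor + 1
    samples = [(i / factor, j / factor) for j in range(n) for i in range(n)]
    triangles = []
    for j in range(factor):
        for i in range(factor):
            a = j * n + i          # lower-left corner of the cell
            b, c, d = a + 1, a + n, a + n + 1
            triangles += [(a, b, d), (a, d, c)]
    return samples, triangles

def evaluate_all(patch_fn, samples):
    """Stand-in for the evaluation stage: one independent call per (u, v)."""
    return [patch_fn(u, v) for (u, v) in samples]

samples, triangles = uniform_domain(4)
# Hypothetical patch: a bilinear "surface" used only to exercise the code.
points = evaluate_all(lambda u, v: (u, v, u * v), samples)
print(len(samples), len(triangles))  # 25 32
```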
1.4 Representations in Surface Modeling
In Computer Graphics, surfaces are represented by polyhedral meshes. A polyhedral
mesh is a collection of vertices, edges and facets. The valence of a vertex is the number of its
incident edges. Each facet is an n-sided polygon. In a triangular (or quadrilateral) mesh, n equals
3 (or 4, respectively). An arbitrary mesh has n-sided polygons where the value of n is arbitrary.
The difference between regular and irregular vertices is explained in Figure 1-6. Figure 1-7
illustrates the three possible types of a facet.
Figure 1-6. Tri- and Quadrilateral meshes and facet types 1,2,3.
Figure 1-7. The three possible configurations. Type-1 Quad is regular. Type-2 or 3 is irregular.
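The valence definition above, together with the Type-1 criterion recalled in Section 2.2.1 (all four vertices have 4 neighbors), can be made concrete with a short sketch. The helper names `vertex_valences` and `is_regular_quad` are illustrative, the mesh is assumed to be given as a list of quads of vertex indices, and the distinction between Type-2 and Type-3 quads is deliberately left out because it depends on Figure 1-7.

```python
from collections import defaultdict

def vertex_valences(quads):
    """Count the distinct edges incident to each vertex of a quad mesh."""
    neighbors = defaultdict(set)
    for quad in quads:
        for k in range(4):
            a, b = quad[k], quad[(k + 1) % 4]
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {v: len(ns) for v, ns in neighbors.items()}

def is_regular_quad(quad, valences):
    """A quad is regular (Type-1) when all four vertices have valence 4."""
    return all(valences[v] == 4 for v in quad)

# A cube as six quads: every vertex has valence 3, so no quad is regular.
cube = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]
valences = vertex_valences(cube)
print(sorted(set(valences.values())))                   # [3]
print(any(is_regular_quad(q, valences) for q in cube))  # False
```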
Parametric patches and subdivision surfaces are major tools for modeling freeform surfaces
with arbitrary topology. A more intuitive way for inexperienced users to create shapes, by drawing
curves or sketching, is also available [22, 36].
1.4.1 Subdivision Surfaces
Subdivision surfaces, as part of standard modeling packages (e.g., 3DMax, Maya, Softimage,
Mirai, Lightwave, etc.), have proven to be a useful modeling tool. Subdivision schemes
were first introduced by [10, 12, 31]. They generate a smooth surface through a mesh refinement
process. This method begins with a coarse mesh that approximates a 3D model, known as a
control mesh. Each vertex in the control mesh is called a control point. Control points influence
the shape of the limit surface. The mesh is refined in each subdivision step by inserting new
vertices into the mesh, refining existing point positions, and updating the connectivity. The
positions of the new vertices in the mesh are computed by averaging rules that apply to the
positions of nearby old vertices. The averaging rules differ from scheme to scheme (see
the comparison in Figure 1-9), and it is these rules that determine the properties of the surface.
The graphs that illustrate the rules are called stencils. Binary subdivision splits each edge
into 2 while ternary subdivision splits each edge into 3. Usually each subdivision scheme has at
most three types of rules: vertex stencil, edge stencil, and face stencil. For example, the stencils
of Catmull-Clark subdivision are shown in Figure 1-8. The refinement rules include stencils
for smooth surfaces as well as special rules for creating sharp or semi-sharp features. Each
refinement step produces a denser mesh than the previous one. The limit subdivision surface is
the surface produced by this process after infinitely many refinements. In practical use,
however, the algorithm is applied only a limited number of times, usually four.
Figure 1-8. The stencils used in Catmull-Clark subdivision. These stencils define the rules to derive the new vertices that lie on the old vertices, edges, and facets.
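As an illustration of the face, edge and vertex stencils, the sketch below applies the commonly stated smooth Catmull-Clark rules to a closed quad mesh. It is an assumption-laden sketch rather than a reproduction of Figure 1-8: the weights are the textbook ones (face point as the facet average, edge point as the average of the two endpoints and two adjacent face points, vertex point as (F + 2R + (n - 3)P)/n), boundaries and crease rules are ignored, the connectivity of the refined mesh is not rebuilt, and all function names are hypothetical.

```python
from collections import defaultdict

def average(points):
    """Component-wise mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def catmull_clark_points(vertices, quads):
    """One step of the smooth Catmull-Clark rules on a closed quad mesh.

    Returns (face_points, edge_points, vertex_points).
    """
    face_points = [average([vertices[i] for i in q]) for q in quads]

    edge_faces = defaultdict(list)      # edge -> indices of adjacent faces
    vertex_faces = defaultdict(list)    # vertex -> indices of adjacent faces
    for f, q in enumerate(quads):
        for k in range(4):
            a, b = q[k], q[(k + 1) % 4]
            edge_faces[(min(a, b), max(a, b))].append(f)
            vertex_faces[a].append(f)

    # Edge stencil: mean of the two endpoints and the two adjacent face points.
    edge_points = {
        e: average([vertices[e[0]], vertices[e[1]]] +
                   [face_points[f] for f in fs])
        for e, fs in edge_faces.items()
    }

    vertex_edges = defaultdict(list)    # vertex -> midpoints of incident edges
    for (a, b) in edge_faces:
        mid = average([vertices[a], vertices[b]])
        vertex_edges[a].append(mid)
        vertex_edges[b].append(mid)

    # Vertex stencil: (F + 2R + (n - 3) P) / n for a vertex P of valence n.
    vertex_points = []
    for i, p in enumerate(vertices):
        n = len(vertex_edges[i])
        F = average([face_points[f] for f in vertex_faces[i]])
        R = average(vertex_edges[i])
        vertex_points.append(tuple((F[k] + 2 * R[k] + (n - 3) * p[k]) / n
                                   for k in range(3)))
    return face_points, edge_points, vertex_points
```

Applied to a cube with six quads, the function returns 6 face points, 12 edge points and 8 vertex points, which together form the 26 vertices of the once-refined mesh.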
A realization of tessellation-on-the-fly for Loop subdivision surfaces was proposed in
[33]. Pulli [44] implemented Loop’s subdivision scheme with additions by Hoppe et al. [19].
Bischoff [3] proposed a forward-differencing method that requires only a constant amount of
memory regardless of the subdivision step. DeRose [13] generalized the infinitely sharp creases
of [19] to obtain semi-sharp creases. Hoppe [19] extended Loop’s scheme by introducing
subdivision rules that lead to a piecewise smooth surface with features such as creases, corners,
darts, and conical vertices.
Figure 1-9. Classification of common Subdivision Schemes.
Adaptive subdivision can dramatically improve performance because the level of
detail (LOD) is updated based on the dynamic distance to the camera as well as the complexity
of each part of the model. Adaptive refinement was previously implemented using a quad-tree data
structure [50]. Each level of the tree represents one refinement level of the mesh. However, it
is difficult to map the recursive non-uniform tree structure to parallel computation. Bunnell [9]
provides code for adaptive refinement. Even though this code was optimized for an earlier
generation of GPUs, the implementation adaptively renders subdivision surfaces in real time
on current hardware. Lai and Cheng [26] implemented adaptive Catmull-Clark subdivision.
Hardware architecture support for adaptive refinement is proposed by [5].
The implementation of subdivision surfaces on the GPU can be roughly categorized
into three groups: (I) recursive evaluation [9, 13, 28, 44, 46]; (II) direct evaluation [45, 47];
(III) pre-tabulated basis function composition [6, 7]. Recursive evaluation is the most intuitive
way, but not the most efficient approach. Stam [47] directly evaluates subdivision surfaces at
arbitrary parameter values. However, Stam’s method cannot evaluate a mesh that contains
Type-3 quads. Moreover, the required projection of control points into the eigenspace is too
complex for large meshes on the GPU. The methods of [6, 7, 9, 46] are not able to convert a mesh
with Type-3 quads either. Getting rid of those quads usually means applying at least one
Catmull-Clark subdivision step on the CPU and a four-fold data transfer to the GPU. In more detail,
Shiue implements recursive Catmull-Clark subdivision using several passes via the pixel shader, using
textures for storage and spiral-enumerated mesh fragments for maximizing parallelism [46]. Bolz
tabulates the subdivision nodal functions up to a given density and linearly combines them on the
GPU [6, 7]. The number of nodal functions equals the number of vertices of the input mesh.
One of the obvious advantages of subdivision surfaces is that they can model surfaces of
arbitrary topological type. Also, because the refinement rules of each scheme are static,
subdivision surfaces are easy to implement. Although subdivision surfaces have been known for nearly
twenty years, their use has been hindered in real-time applications such as games because
recursive refinement is efficient in neither memory nor performance. Multiple passes
are required to render a visually smooth surface. Moreover, the approximately 4-fold increase in
geometry after each subdivision step causes heavy memory traffic on the bus between the CPU and
the GPU.
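The 4-fold growth is easy to quantify. The following sketch assumes a pure quad mesh, so that each refinement step splits every quad into four; the function name `quads_after` and the starting count of 1000 quads are hypothetical values chosen for illustration.

```python
def quads_after(initial_quads, steps):
    """Quad count after `steps` refinements, each splitting every quad into 4."""
    return initial_quads * 4 ** steps

for k in range(5):
    print(k, quads_after(1000, k))
# The fourth step yields 1000 * 4**4 = 256000 quads.
```

With the four refinement steps mentioned earlier as typical, the geometry is 256 times larger than the control mesh, which is the data volume that recursive schemes have to store or move.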
1.4.2 Parametric Patches
Since current and impending GPU configurations favor short explicit surface definitions
over recursively defined surfaces, the alternative, patch-based refinement, has been advocated for
fast rendering. Parametric patches (PP for short) are rendered directly in terms of their polynomial
representations, as opposed to a collection of approximating facets. Generally speaking, a PP
scheme converts control meshes to a set of patches that are parametric piecewise polynomials. PP
schemes fit conveniently into a 2-pass implementation on the current graphics pipeline
(Figure 1-10). The two rendering passes are combined into one pass in a future GPU pipeline
(Figure 1-11) [48].
Figure 1-10. The animation and Displacement Mapping (DM) take place in the VS of the first pass and the second pass, respectively. The first pass converts the deformed control mesh to its parametric patch representation. In the following pass, the details are added using DM after the evaluation of the patches produced by the previous pass.
The overall speed of a PP scheme is influenced by both the complexity of the patches and the
number of patches. In terms of shape, a desirable PP scheme ensures at least G1 continuity
across adjacent patches and closely approximates subdivision surfaces. One of the
biggest challenges is to ensure smoothness everywhere over the patches. Peters explained how
to solve the vertex enclosure problem and geometric continuity in [39, 41].
GPU-based evaluation of trimmed NURBS surfaces is proposed in [16, 25]. Peters [40]
used an approximation to the limit surface of Doo-Sabin subdivision to get a quickly convergent
series of approximations to the volume enclosed by the subdivision surface. The difficult problem
of filling n-sided holes was recently solved by [21, 42]. Bajaj et al. [2] introduced A-patches
in tri-variate BB form with few free parameters to adjust the shape both locally and globally.
In [15], the free-form surface is represented either in NURBS form or as cubic triangular
Bezier patches. An explicit spline representation of smooth free-form surfaces is to form the
basis of an interactive sculpting environment. In the spirit of the Tessellator, Boubekeur [8]
describes a generic refinement pattern for Surface Modeling (tessellation + displacement) on any
programmable GPU.
Figure 1-11. One possible pass on the future graphics rendering pipeline.
1.4.2.1 Bezier technique
The Bezier form is a parametric surface representation and was first developed in 1972
by the French engineer Pierre Bezier. A comprehensive overview of the Bezier form can be
found in [43]. A Bezier patch is defined by control points. A Bezier surface, as a set of Bezier
patches, is piecewise polynomial. Bezier patches are visually intuitive and mathematically
convenient due to the following properties:
1. Affine invariance: Applying an affine transformation to a control mesh applies it to the corresponding Bezier patch as well.
2. The convex hull property: A Bezier patch lies completely within the convex hull of its control points, and therefore also completely within the bounding box of its control points in any given Cartesian coordinate system.
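Both properties can be checked numerically. For brevity the sketch below uses a planar Bezier curve instead of a patch; the same argument carries over to patches because in both cases the basis functions are non-negative and sum to one. The control points, the affine map and the function names are hypothetical.

```python
from math import comb

def bezier_point(ctrl, t):
    """Point on a Bezier curve with 2D control points ctrl at parameter t."""
    n = len(ctrl) - 1
    weights = [comb(n, i) * t ** i * (1 - t) ** (n - i) for i in range(n + 1)]
    return tuple(sum(w * p[k] for w, p in zip(weights, ctrl)) for k in range(2))

def affine(p):
    """A hypothetical affine map: rotate by 90 degrees, scale by 2, translate."""
    return (-2.0 * p[1] + 3.0, 2.0 * p[0] - 1.0)

ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
for t in (0.0, 0.25, 0.5, 1.0):
    a = affine(bezier_point(ctrl, t))                     # evaluate, then map
    b = bezier_point([affine(p) for p in ctrl], t)        # map, then evaluate
    assert all(abs(x - y) < 1e-12 for x, y in zip(a, b))  # affine invariance
    x, y = bezier_point(ctrl, t)
    assert 0.0 <= x <= 4.0 and 0.0 <= y <= 2.0            # bounding box of ctrl
```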
There are two types of Bezier patch:
A tensor product patch in Bezier form of degree m by n is defined as:
\[
g(u, v) := \sum_{i=0}^{m} \sum_{j=0}^{n} g_{ij} \binom{m}{i} u^{i} (1-u)^{m-i} \binom{n}{j} v^{j} (1-v)^{n-j},
\]
where (u, v) is a barycentric coordinate on the domain of [0, 1] × [0, 1].
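A direct, unoptimized evaluation of the tensor-product formula can be sketched as follows. The names `bernstein` and `eval_tensor_patch` and the flat test net are assumptions made for the example; a GPU implementation would typically unroll the bi-cubic case rather than loop.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_i^n(t) = C(n, i) t^i (1 - t)^(n - i)."""
    return comb(n, i) * t ** i * (1 - t) ** (n - i)

def eval_tensor_patch(g, u, v):
    """Evaluate a degree m-by-n patch; g[i][j] is the 3D coefficient g_ij."""
    m, n = len(g) - 1, len(g[0]) - 1
    point = [0.0, 0.0, 0.0]
    for i in range(m + 1):
        for j in range(n + 1):
            w = bernstein(m, i, u) * bernstein(n, j, v)
            for k in range(3):
                point[k] += w * g[i][j][k]
    return tuple(point)

# Hypothetical bi-cubic control net: a flat 4x4 grid over the unit square.
net = [[(i / 3, j / 3, 0.0) for j in range(4)] for i in range(4)]
print(eval_tensor_patch(net, 0.0, 0.0))  # (0.0, 0.0, 0.0): corner interpolation
```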
A triangular Bezier patch of degree n is defined as:
\[
b(s, t, w) := \sum_{\substack{i+j+k=n \\ i,j,k \ge 0}} b_{ijk} \frac{n!}{i!\,j!\,k!}\, s^{i} t^{j} w^{k},
\]
where (s, t, w) are the barycentric coordinates on a triangle domain.
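The triangular form can be evaluated in the same direct way. In this sketch the coefficients are stored in a dictionary keyed by (i, j, k); that storage choice, the function name and the degree-2 test net are assumptions for illustration only.

```python
from math import factorial

def eval_triangular_patch(b, n, s, t, w):
    """Evaluate a degree-n triangular patch; b maps (i, j, k) to a 3D coefficient."""
    point = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        for j in range(n + 1 - i):
            k = n - i - j
            weight = (factorial(n) // (factorial(i) * factorial(j) * factorial(k))
                      * s ** i * t ** j * w ** k)
            for c in range(3):
                point[c] += weight * b[(i, j, k)][c]
    return tuple(point)

# Hypothetical degree-2 net whose coefficients sit at (i/n, j/n, 0).
n = 2
net = {(i, j, n - i - j): (i / n, j / n, 0.0)
       for i in range(n + 1) for j in range(n + 1 - i)}
print(eval_triangular_patch(net, n, 1.0, 0.0, 0.0))  # (1.0, 0.0, 0.0): corner b_200
```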
1.4.2.2 Related work
For quadrilateral input meshes, it is well known that Type-1 quads can be converted into
degree 3 by 3 patches in tensor-product Bezier form by the standard B-spline to Bezier
conversion rules [14]. Therefore, any two adjacent patches derived from ordinary quads will join C2.
The interesting aspect is the conversion of Type-2 and Type-3 quads. A number of techniques (see
the comparison in Figure 1-12) exist to smooth out quad meshes. Peters [38] generates NURBS
output that could be rendered, for example, by the GPU algorithm of [17], but this has not been
implemented. The method of [30] generates one bicubic patch per quad following the shape of
Catmull-Clark surfaces. Since these bicubic patches typically do not join smoothly, Loop and
Schaefer compute two additional patches whose cross product approximates the normal of the
bicubic patch. As pointed out in [49], this trompe l’oeil represents a simple solution when true
smoothness is not needed. Comparing the number of operations in construction and evaluation,
the method of [30] should run at speeds comparable to our GPU quad mesh smoothing. Our
method [37] designs a c-patch for converting an irregular quad. The resulting c-patches form a
G1 surface. The alternative algorithm proposed by [35] uses a bi-5 Bezier patch for each irregular
quad.
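The B-spline to Bezier conversion mentioned for Type-1 quads can be sketched for the uniform bi-cubic case. The weights below are the well-known uniform cubic change of basis, applied first along one grid direction and then along the other; the exact formulation in [14] may be organized differently, and the names `B2B`, `combine` and `bspline_to_bezier` are hypothetical. The input is the 4x4 grid of mesh points around the quad, and the output is the 4x4 Bezier control net of the patch over the central quad.

```python
# Rows of the uniform cubic B-spline to Bezier change-of-basis matrix.
B2B = [(1 / 6, 4 / 6, 1 / 6, 0.0),
       (0.0, 4 / 6, 2 / 6, 0.0),
       (0.0, 2 / 6, 4 / 6, 0.0),
       (0.0, 1 / 6, 4 / 6, 1 / 6)]

def combine(weights, points):
    """Weighted sum of four 3D points."""
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(3))

def bspline_to_bezier(p):
    """p: 4x4 grid of 3D B-spline control points; returns 4x4 Bezier points."""
    # Convert along the second index of the grid first ...
    rows = [[combine(B2B[c], p[r]) for c in range(4)] for r in range(4)]
    # ... then along the first index.
    return [[combine(B2B[a], [rows[r][c] for r in range(4)]) for c in range(4)]
            for a in range(4)]

# Hypothetical flat input: mesh points on the integer grid 0..3 x 0..3.
grid = [[(float(r), float(s), 0.0) for s in range(4)] for r in range(4)]
bezier = bspline_to_bezier(grid)
print(bezier[0][0], bezier[3][3])  # approximately (1, 1, 0) and (2, 2, 0)
```

For the flat grid the resulting patch covers the central quad, and its corner weights reduce to the familiar (1, 4, 1; 4, 16, 4; 1, 4, 1)/36 mask.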
Figure 1-12. This figure compares existing PP schemes in terms of how well they meet the performance and shape measurements. geom=geometry patches, tan=tangent patches.
CHAPTER 2
A NEW SCHEME FOR SURFACE CONSTRUCTION
2.1 Contribution
This thesis proposes a set of rules for converting a quadrilateral mesh to a surface consisting
of bi-cubic splines wherever possible. Each irregular quad (Figure 1-7) is converted to a novel
C1 surface patch (c-patch for short). The surface closely mimics the shape of the Catmull-Clark
subdivision surface and is constructed entirely by local parallel operations on the GPU. The resulting
surface is piecewise polynomial and has well-defined normals everywhere. The evaluation avoids
pixel dropout.
A c-patch is a C1 piecewise polynomial patch with cubic boundary. It is defined by 24
coefficients whose instantiation for a smooth surface is given in Section xxx below and indicated
in Figure 2-1. A c-patch has an alternative representation as four triangular patches of total
degree 4 in Bernstein-Bezier form (Figure 2-5 right).
Figure 2-1. The c-patch coefficients. Fori = 0, 1, 2, 3, the boundary coefficientsvi andeij
defined by vertex neighborhoods(figure2-4 specifies the formulas). The interiorcoefficientsbi
211, bi121, bi
112 (figure2-6), wherei = 0..3, j = 0..ni, andni is thevalence ofvi.
2.2 The Conversion Algorithm
Here we give the detailed algorithm for converting the quad mesh into coefficients that define a smooth surface of low degree. Essentially, the conversion from a mesh to a patch consists of computing new points near a vertex using the knowledge of the vertex neighborhood. A vertex neighborhood consists of a mesh point $p_*$ and the mesh points $p_k$, $k = 0, \ldots, 2N - 1$, of all quads surrounding $p_*$ (Figure 2-2). The union of the four vertex neighborhoods of a quad is the quad neighborhood (Figure 2-3 a) that defines a patch. In our scheme, the patch is either a tensor-product bicubic Bezier patch or a c-patch.
Figure 2-2. Smoothing the vertex neighborhood according to Figure 2-4. The center point $p_*$, its direct neighbors $p_{2j}$ and diagonal neighbors $p_{2j+1}$ form a vertex neighborhood, $j = 0..n-1$.
Figure 2-3. a) A quad neighborhood defining a surface piece. b) A bicubic patch with $4 \times 4$ control points. This patch is the output if the quad is regular, and is used to determine the shape of a c-patch c) if the quad is irregular. A c-patch is defined by $4 \times 6$ control points displayed as • and can alternatively, for analysis, be represented as four $C^1$-connected triangular pieces of degree 4 with degree 3 outer boundaries identical to the bicubic patch boundaries.
2.2.1 The Conversion Rules for a Type-1 Quad
Recall that a quad is Type-1 if all four vertices have 4 neighbors. Type-1 quads are considered regular in the literature. Such a facet is converted into a patch of degree 3 by 3 in tensor-product Bezier form by the standard B-spline to Bezier conversion rules [14]. Therefore, any two adjacent patches derived from Type-1 quads join $C^2$. Figure 2-3 illustrates the derivation of a bicubic Bezier patch from a quad. The conversion rules are shown in Figure 2-4.
Figure 2-4. Computing control points $v$, $e$, $f$ and $t$, the projection of $e$, at a vertex of valence $N$ from the mesh points $p_j$ of a vertex neighborhood; the subscripts are modulo $2N$. By default, $\sigma_N := \bigl(c_N + 5 + \sqrt{(c_N + 9)(c_N + 1)}\bigr)/16$, the subdominant eigenvalue of Catmull-Clark subdivision.
A vertex $v$ computed according to Figure 2-4 is the limit point of Catmull-Clark subdivision as explained, for example, in [18]. The rules for $e_j$ and $f_j$ are the standard rules for converting a uniform bicubic tensor-product B-spline to its Bezier representation. The points $t_j$ are a projection of $e_j$ into a common tangent plane (see e.g. [15]). The default scale factor $\sigma$ is the subdominant eigenvalue of Catmull-Clark subdivision. We note that for $N = 4$, $e_{j+2} = 2v - e_j$ and $\sigma = 1/2$ so that the projection leaves the tangent control points invariant as $t_j = e_j$:
$$\text{for } N = 4, \quad t_j = v + \tfrac{2}{4}(e_j - e_{j+2}) = v + (e_j - v) = e_j. \tag{2–1}$$
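The default scale factor can be checked numerically. The following is a minimal sketch, assuming that $c_N$ in the caption of Figure 2-4 stands for $\cos(2\pi/N)$ (an assumption that is at least consistent with the stated value $\sigma = 1/2$ for $N = 4$); the function name is hypothetical.

```python
import math

def subdominant_eigenvalue(valence: int) -> float:
    """Default scale factor sigma_N as written in the caption of Figure 2-4.

    Assumption: c_N = cos(2*pi/N); this is not spelled out in the caption.
    """
    c = math.cos(2.0 * math.pi / valence)
    return (c + 5.0 + math.sqrt((c + 9.0) * (c + 1.0))) / 16.0

if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, subdominant_eigenvalue(n))
```

For the regular case the value reduces to $(5 + 3)/16 = 1/2$, which is the factor used in (2–1).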
In the next stage, we combine information from four vertex neighborhoods, as shown in Figure 2-5, to populate a tensor-product patch $g$ of degree 3 by 3 in Bezier form [14]:
$$g(u, v) := \sum_{k=0}^{3} \sum_{\ell=0}^{3} g_{k\ell} \binom{3}{k} u^k (1 - u)^{3-k} \binom{3}{\ell} v^{\ell} (1 - v)^{3-\ell}.$$
The patch is defined by its 16 control points $g_{k\ell}$. The formulas of Figure 2-4 make this patch the Bezier representation of a bicubic spline in B-spline form. For example, in the notation of Figure 2-5, $(g_{k0})_{k=0,\ldots,3} = (v^0, t^0_0, t^1_1, v^1)$.
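The tensor-product formula for $g(u, v)$ can be evaluated directly from the Bernstein weights. The sketch below is illustrative only: the function names and the layout `g[k][l]` for the control points $g_{k\ell}$ are assumptions, and it is not the shader code used in the implementation.

```python
from math import comb

def bernstein3(k: int, t: float) -> float:
    """Cubic Bernstein polynomial binom(3, k) t^k (1 - t)^(3 - k)."""
    return comb(3, k) * t**k * (1.0 - t) ** (3 - k)

def eval_bicubic(g, u: float, v: float):
    """Evaluate a bicubic Bezier patch; g[k][l] is the 3D control point g_kl."""
    point = [0.0, 0.0, 0.0]
    for k in range(4):
        for l in range(4):
            weight = bernstein3(k, u) * bernstein3(l, v)
            for axis in range(3):
                point[axis] += weight * g[k][l][axis]
    return tuple(point)

# A flat control net on a uniform grid reproduces (u, v, 0) by linear precision.
flat_net = [[(k / 3.0, l / 3.0, 0.0) for l in range(4)] for k in range(4)]

if __name__ == "__main__":
    print(eval_bicubic(flat_net, 0.25, 0.75))
```

At $(u, v) = (1/2, 1/2)$ the 16 weights reduce to the outer product of $(1, 3, 3, 1)/8$ with itself.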
Figure 2-5. Patch construction. On the left, four vertex neighborhoods with vertices $v^i$ each contribute one sector to assemble the $4 \times 4$ coefficients of the Bezier patch $g$, for example $g_{00} = v^0$, $g_{10} = e^0_0$, $g_{11} = f^0$, $g_{30} = v^1$, $g_{31} = e^1_0$ (we use superscripts to indicate vertices). On the right, the same four sectors are used to determine a c-patch if the underlying quad is extraordinary. The indices of the control points of $g$ and $b^i$ are shown. Note that only a subset of the coefficients of the four triangular pieces $b^i$ is actually computed to define the c-patch. The full set of coefficients displayed here is only used to analyze the construction. The indexing of the 15 coefficients of a quartic triangular patch is shown on the right. We use this labeling throughout the dissertation.
2.2.2 The Conversion Rules for a Type-2 or Type-3 Quad
Type-2 and Type-3 quads are known as irregular. An irregular quad has at least one and possibly up to four vertices with valence other than 4. For each irregular quad, the conversion involves two steps:
1. Apply the regular rules defined in Figure 2-4 to generate $v^i$ and $e^i_j$, shown in Figure 2-1 left.
2. Then apply the rules in Figure 2-6 to yield $b^i_{211}$, $b^i_{121}$, $b^i_{112}$, shown in Figure 2-1 right.
We use the bicubic patch to outline the shape as we replace it by a c-patch (Figure 2-3, c). A c-patch has the right degrees of freedom to cheaply and locally construct a smooth surface. We introduce the c-patch in terms of the well-known Bezier form of a polynomial piece $b^i$ of total degree 4 [14]:
$$b^i(u_1, u_2) := \sum_{\substack{k+\ell+m=4 \\ k,\ell,m \ge 0}} b^i_{k\ell m} \frac{4!}{k!\,\ell!\,m!}\, u_1^k u_2^{\ell} (1 - u_1 - u_2)^m. \tag{2–2}$$
The c-patch is equivalent to the union of four pieces $b^i$, $i = 0, 1, 2, 3$, of total degree 4, but is defined by only $4 \times 6$ c-coefficients constructed in Figures 2-4 and 2-6:
$$v^i,\; t^i_0,\; t^i_1,\; b^i_{211},\; b^i_{121},\; b^i_{112}, \qquad i = 0, 1, 2, 3.$$
These 24 c-coefficients imply the missing interior control points of the representation (2–2) by $C^1$ continuity between the triangular pieces: for $j = 0, 1, 2, 3$ and $i = 0, 1, 2, 3$,
$$b^i_{3-j,0,1+j} = b^{i-1}_{0,3-j,1+j} := \bigl(b^i_{3-j,1,j} + b^{i-1}_{1,3-j,j}\bigr)/2; \tag{2–3}$$
and the boundary control points $b^i_{k\ell 0}$ are implied by degree-raising [14]:
$$b^i_{400} := v^i, \quad b^i_{310} := (v^i + 3t^i_0)/4, \quad b^i_{220} := (t^i_0 + t^{i+1}_1)/2, \quad b^i_{130} := (v^{i+1} + 3t^{i+1}_1)/4, \quad b^i_{040} := v^{i+1}. \tag{2–4}$$
For all objects with boundaries, the boundary rules are simply the derivation of the cubic Bezier curves defined by $(v^i, t^i_0, t^{i+1}_1, v^{i+1})$. Basis functions corresponding to the 24 c-coefficients of the c-patch can be read off by setting one c-coefficient to one and all others to zero and then applying (2–3) and (2–4).
Figure 2-6. Formulas for the $4 \times 3$ interior control points that, together with the vertex control points $v^i$ and the tangent control points $t^i_j$, define a c-patch. See also Figures 2-11 and 2-12. Here $c^i := \cos\frac{2\pi}{N_i}$, $s^i := \sin\frac{2\pi}{N_i}$ and superscripts are modulo 4. By default, $g^* := \bigl(\sum_{i=0}^{3} \bigl(v^i + 3(e^i_0 + e^i_1) + 9f^i\bigr)\bigr)/64$, the central point of the ordinary patch.
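The degree-raising rule (2–4) can be illustrated with a short sketch: the five quartic boundary coefficients must trace the same curve as the cubic defined by $(v^i, t^i_0, t^{i+1}_1, v^{i+1})$. The helper names (`lin`, `boundary_coefficients`, `eval_bezier_curve`) are hypothetical and only serve this illustration.

```python
def lin(*terms):
    """Linear combination of points; terms are (weight, point) pairs."""
    dim = len(terms[0][1])
    return tuple(sum(w * p[d] for w, p in terms) for d in range(dim))

def boundary_coefficients(v_i, t_i0, t_next1, v_next):
    """Quartic boundary coefficients b400, b310, b220, b130, b040 as in (2-4)."""
    b400 = v_i
    b310 = lin((0.25, v_i), (0.75, t_i0))
    b220 = lin((0.5, t_i0), (0.5, t_next1))
    b130 = lin((0.25, v_next), (0.75, t_next1))
    b040 = v_next
    return [b400, b310, b220, b130, b040]

def eval_bezier_curve(ctrl, t):
    """de Casteljau evaluation of a Bezier curve of any degree."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [lin((1.0 - t, p), (t, q)) for p, q in zip(pts, pts[1:])]
    return pts[0]

if __name__ == "__main__":
    cubic = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (3.0, 2.0, 1.0), (4.0, 0.0, 1.0)]
    quartic = boundary_coefficients(*cubic)
    print(eval_bezier_curve(cubic, 0.3), eval_bezier_curve(quartic, 0.3))
```

Because both control polygons describe the same curve, a bicubic patch and a c-patch that share an edge see identical boundary geometry.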
2.3 Derivation of the Coefficients of a c-patch
When a c-patch sector $b$ meets a c-patch sector $a$ (Figure 2-12), the following equation must hold to preserve $G^1$ continuity across the boundary between $b$ and $a$:
$$\lambda(u)\,\partial_1 b(u, 0) = \partial_2 b(u, 0) + \partial_1 a(0, u), \tag{2–5}$$
where $\cdot$ denotes the scalar product of a tuple of coefficients with a tuple of basis polynomials, and
$$\lambda(u) := (\lambda_0, \lambda_1) \cdot (u, 1 - u),$$
$$\partial_1 b(u, 0) := 3(U_0, 2U_1, U_2) \cdot \bigl(u^2, u(1 - u), (1 - u)^2\bigr),$$
$$\partial_2 b(u, 0) := 4(v_0, 3v_1, 3v_2, v_3) \cdot \bigl(u^3, u^2(1 - u), u(1 - u)^2, (1 - u)^3\bigr),$$
$$\partial_1 a(0, u) := 4(w_0, 3w_1, 3w_2, w_3) \cdot \bigl(u^3, u^2(1 - u), u(1 - u)^2, (1 - u)^3\bigr).$$
Equation (2–5) can be rewritten as the following collection of simplified equations in terms of $U_i$, $v_i$, $w_i$:
$$3\lambda_0 U_0 = 4v_0 + 4w_0 \tag{2–6}$$
$$6\lambda_0 U_1 + 3\lambda_1 U_0 = 12(v_1 + w_1) \tag{2–7}$$
$$3\lambda_0 U_2 + 6\lambda_1 U_1 = 12(v_2 + w_2) \tag{2–8}$$
$$3\lambda_1 U_2 = 4v_3 + 4w_3 \tag{2–9}$$
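The step from (2–5) to (2–6) through (2–9) is a coefficient comparison. As a worked expansion under the definitions given for (2–5), multiplying out the left-hand side and collecting terms gives:

```latex
\begin{align*}
\lambda(u)\,\partial_1 b(u,0)
  &= 3\bigl(\lambda_0 u + \lambda_1 (1-u)\bigr)
     \bigl(U_0 u^2 + 2U_1 u(1-u) + U_2 (1-u)^2\bigr) \\
  &= 3\lambda_0 U_0\, u^3
     + (6\lambda_0 U_1 + 3\lambda_1 U_0)\, u^2(1-u)
     + (3\lambda_0 U_2 + 6\lambda_1 U_1)\, u(1-u)^2
     + 3\lambda_1 U_2\, (1-u)^3, \\
\partial_2 b(u,0) + \partial_1 a(0,u)
  &= 4(v_0 + w_0)\, u^3
     + 12(v_1 + w_1)\, u^2(1-u)
     + 12(v_2 + w_2)\, u(1-u)^2
     + 4(v_3 + w_3)\, (1-u)^3 .
\end{align*}
```

Since $u^3$, $u^2(1-u)$, $u(1-u)^2$ and $(1-u)^3$ are linearly independent, the two sides agree exactly when their four coefficients agree, which yields the four equations.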
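2.3.1 Derivation of $\lambda_0$ and $\lambda_1$
The scalar $\lambda_0$ is derived from (2–6); (2–9) sets the constraint for $\lambda_1$.
Let $U_0 := (1, 0)$, $V_0 := (\cos\frac{2\pi}{n_0}, \sin\frac{2\pi}{n_0})$, and $W_0 := (\cos\frac{2\pi}{n_0}, -\sin\frac{2\pi}{n_0})$ (Figure 2-7). We know $u_0 = \frac{3}{4}U_0$, $u_3 = \frac{3}{4}U_2$ from degree raising. Then
$$\begin{aligned}
v_0 + w_0 &= \tfrac{1}{2}\bigl(\tfrac{3}{4}V_0 + \tfrac{3}{4}U_0\bigr) + \tfrac{1}{2}\bigl(\tfrac{3}{4}W_0 + \tfrac{3}{4}U_0\bigr) \\
&= \tfrac{3}{4}\Bigl(\frac{1 + \cos\frac{2\pi}{n_0}}{2}, \frac{\sin\frac{2\pi}{n_0}}{2}\Bigr) + \tfrac{3}{4}\Bigl(\frac{1 + \cos\frac{2\pi}{n_0}}{2}, -\frac{\sin\frac{2\pi}{n_0}}{2}\Bigr) \\
&= \tfrac{3}{4}\bigl(1 + \cos\tfrac{2\pi}{n_0}, 0\bigr) \\
&= \tfrac{3}{4}\bigl(1 + \cos\tfrac{2\pi}{n_0}\bigr)U_0.
\end{aligned} \tag{2–10}$$
Hence $4(v_0 + w_0) = 3(1 + \cos\frac{2\pi}{n_0})U_0$ and, by (2–6), $\lambda_0 = 1 + \cos\frac{2\pi}{n_0}$.
Similarly, because $V_3 = (1 - \cos\frac{2\pi}{n_1}, \sin\frac{2\pi}{n_1})$ and $W_3 = (1 - \cos\frac{2\pi}{n_1}, -\sin\frac{2\pi}{n_1})$,
$$4(v_3 + w_3) = 3\bigl(1 - \cos\tfrac{2\pi}{n_1}\bigr)U_2. \tag{2–11}$$
Hence $\lambda_1 = 1 - \cos\frac{2\pi}{n_1}$.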
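The computation in (2–10) can be replayed numerically in the tangent plane. The sketch below assumes the planar setup of this subsection ($U_0$, $V_0$, $W_0$ as unit vectors at angles $0$ and $\pm 2\pi/n_0$); the function name is hypothetical.

```python
import math

def lambda0_from_geometry(n0: int):
    """Recover lambda_0 from 4 (v0 + w0) = 3 lambda_0 U0, following (2-10).

    Returns (lambda_0, residual_y); residual_y should vanish because the
    sine components of V0 and W0 cancel.
    """
    angle = 2.0 * math.pi / n0
    U0 = (1.0, 0.0)
    V0 = (math.cos(angle), math.sin(angle))
    W0 = (math.cos(angle), -math.sin(angle))
    # v0 and w0 are the degree-raised (factor 3/4) averages used in (2-10).
    v0 = tuple(0.5 * 0.75 * (V0[d] + U0[d]) for d in range(2))
    w0 = tuple(0.5 * 0.75 * (W0[d] + U0[d]) for d in range(2))
    total = (v0[0] + w0[0], v0[1] + w0[1])
    return 4.0 * total[0] / 3.0, total[1]

if __name__ == "__main__":
    for n0 in (3, 4, 5, 6):
        print(n0, lambda0_from_geometry(n0))
```

For $n_0 = 4$ this gives $\lambda_0 = 1$, the value used in the regular case.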
2.3.2 Derivation of $b_{211}$ and $b_{121}$
To derive the formulas for $b^i_{211}$ and its symmetric counterpart $b^i_{121}$, note that the formulas must guarantee a smooth transition between $b^i$ and its neighbor patch on an adjacent quad, regardless of whether the adjacent quad is regular or irregular. That is, the formulas are derived to satisfy simultaneously two types of smoothness constraints (see Section 2.4).
Figure 2-7. The re-parameterization of $\lambda$ to meet $G^1$ at the vertex.
Figure 2-8. Coefficients $b_{211}$ and $b_{121}$ of a c-patch are derived on top of a ghost patch. (Labels in the figure: ghost patch, triangular patches.)
From Equation (2–7), we obtain
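$$b_{211} + a_{211} = \tfrac{1}{2}\lambda_0 U_1 + \tfrac{1}{4}\lambda_1 U_0 + 2b_{310}. \tag{2–12}$$
To get a second constraint and determine $b_{211}$ uniquely, we consider the values $b^*_{211}$ and $a^*_{211}$ obtained from each ghost patch in terms of sine-weighted averages (Figure 2-8):
$$4s_0(b_{211} - b_{310}) + 4s_1(b_{211} - b_{220}) = 3(b_{11} - b_{10})$$
yields
$$b_{211} = \frac{4s_0 b_{310} + 4s_1 b_{220} + 3(f^0_0 - t^0_0)}{4(s_0 + s_1)}. \tag{2–13}$$
Similarly,
$$a_{211} = \frac{4s_0 b_{310} + 4s_1 b_{220} + 3(f^0_{n_0-1} - t^0_0)}{4(s_0 + s_1)}. \tag{2–14}$$
Therefore,
$$b_{211} - a_{211} = \frac{3(f^0_0 - e^0_0)}{2(s_0 + s_1)}. \tag{2–15}$$
Together with Equation (2–12),
$$b_{211} = b_{310} + \tfrac{1}{4}\lambda_0(t^1_1 - t^0_0) + \tfrac{1}{8}\lambda_1(t^0_0 - v^0) + \frac{3(f^0_0 - e^0_0)}{4(s_0 + s_1)}. \tag{2–16}$$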
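The last step combines the sum (2–12) and the difference (2–15) of $b_{211}$ and $a_{211}$. The following sketch checks, one coordinate at a time, that half the sum plus half the difference reproduces the closed form (2–16). It assumes $U_0 = t^0_0 - v^0$ and $U_1 = t^1_1 - t^0_0$, which is what a comparison of (2–12) with (2–16) suggests; all function and argument names are hypothetical.

```python
def b211_closed_form(b310, v0, t00, t11, f00, e00, lam0, lam1, s0, s1):
    """Right-hand side of (2-16) for one coordinate."""
    return (b310 + lam0 * (t11 - t00) / 4.0 + lam1 * (t00 - v0) / 8.0
            + 3.0 * (f00 - e00) / (4.0 * (s0 + s1)))

def b211_from_constraints(b310, v0, t00, t11, f00, e00, lam0, lam1, s0, s1):
    """Combine the sum (2-12) and the difference (2-15) for one coordinate."""
    U0 = t00 - v0    # assumed first tangent difference
    U1 = t11 - t00   # assumed second tangent difference
    total = lam0 * U1 / 2.0 + lam1 * U0 / 4.0 + 2.0 * b310   # b211 + a211
    diff = 3.0 * (f00 - e00) / (2.0 * (s0 + s1))             # b211 - a211
    return 0.5 * (total + diff)

if __name__ == "__main__":
    args = (0.4, 0.1, 0.5, 1.2, 0.9, 0.6, 1.3, 0.8, 0.95, 0.87)
    print(b211_closed_form(*args), b211_from_constraints(*args))
```

The same pattern (half sum plus half difference) applies to the symmetric coefficient $b_{121}$.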
Equation (2–8) implies
$$b_{121} + a_{121} = \tfrac{1}{4}\lambda_0 U_2 + \tfrac{1}{2}\lambda_1 U_1 + 2b_{130}. \tag{2–17}$$
Using the same approach as for $b_{211}$,
$$4s_0(b_{121} - b_{220}) + 4s_1(b_{121} - b_{130}) = 3(b_{21} - b_{20})$$
yields
$$b_{121} = \frac{4s_1 b_{130} + 4s_0 b_{220} + 3(f^1_0 - t^1_1)}{4(s_0 + s_1)}. \tag{2–18}$$
Similarly,
$$a_{121} = \frac{4s_1 b_{130} + 4s_0 b_{220} + 3(f^1_1 - t^1_1)}{4(s_0 + s_1)}. \tag{2–19}$$
Equations (2–18) and (2–19) imply
$$b_{121} - a_{121} = \frac{3(f^1_0 - e^1_1)}{2(s_0 + s_1)}. \tag{2–20}$$
Equations (2–17) and (2–20) imply
$$b_{121} = b_{130} + \tfrac{1}{8}\lambda_0(v^1 - t^1_1) + \tfrac{1}{4}\lambda_1(t^1_1 - t^0_0) + \frac{3(f^1_0 - e^1_1)}{4(s_0 + s_1)}. \tag{2–21}$$
The formulas (2–16) and (2–21) are the same as shown in Figure 2-6.
2.3.3 Derivation of $b_{112}$
By contrast, $b^i_{112}$ is not pinned down by continuity constraints. We could choose each $b^i_{112}$ arbitrarily without changing the formal smoothness of the resulting surface. However, we opt for increased smoothness at the center of the c-patch and additionally use the freedom to closely mimic the shape of Catmull-Clark subdivision surfaces, as we did earlier for vertices. First, we approximately satisfy four $C^2$ constraints across the diagonal boundaries at the central point $b_{004}$ (Figure 2-9) by enforcing
$$\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ -1 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} b^0_{112} \\ b^1_{112} \\ b^2_{112} \\ b^3_{112} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} b^0_{211} - b^1_{121} - q \\ b^1_{211} - b^2_{121} - q \\ b^2_{211} - b^3_{121} - q \\ b^3_{211} - b^0_{121} - q \end{pmatrix}, \tag{2–22}$$
where $q := \frac{1}{4}\sum_{i=0}^{3}(b^i_{211} - b^i_{121})$. The perturbation by $q$ is necessary, since the coefficient matrix of the $C^2$ constraints is rank deficient. After perturbation, the system can be solved, with the last equation implied by the first three. We add the constraint that the average of the $b^i_{112}$ matches $g^* := g(\frac{1}{2}, \frac{1}{2})$, the center position of the bicubic patch:
$$\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} b^0_{112} \\ b^1_{112} \\ b^2_{112} \\ b^3_{112} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} b^0_{211} - b^1_{121} - q \\ b^1_{211} - b^2_{121} - q \\ b^2_{211} - b^3_{121} - q \\ 8g^* \end{pmatrix}.$$
Figure 2-9. Dark lines cover the control points involved in the $C^2$ constraints (2–22). The points on dashed lines are implied by averaging.
The point $g^*$ lies on the bicubic patch at $u = 0.5$ and $v = 0.5$. All control points of the bicubic patch except the four interior ones are already given, because the control points on the boundaries have been calculated. For the interior points, we can use a mask that determines Bezier control points from a uniform bicubic B-spline surface. Figure 2-10(a) is a mask for $b_{11}$. For the other interior points, we can use a symmetric mask.
Figure 2-10. The center of a bicubic patch can be evaluated as a linear combination of the boundary coefficients.
Figure 2-10(b) shows a mask for the evaluation of the bicubic patch at $(0.5, 0.5)$:
$$\begin{aligned} g^* = \tfrac{1}{64}\bigl(&b_{00} + 3b_{01} + 3b_{02} + b_{03} + 3b_{10} + 9b_{11} + 9b_{12} + 3b_{13} \\ &+ 3b_{20} + 9b_{21} + 9b_{22} + 3b_{23} + b_{30} + 3b_{31} + 3b_{32} + b_{33}\bigr). \end{aligned}$$
Now we can solve for the $b^i_{112}$, $i = 0, 1, 2, 3$, and obtain the formula of Figure 2-6.
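The $4 \times 4$ system with the averaging row can be solved by back-substitution. The sketch below works on one coordinate at a time and is only an illustration of the linear algebra; whether its closed form is written the same way as the formula of Figure 2-6 is not verified here, and the function name is hypothetical.

```python
def solve_b112(b211, b121, g_star):
    """Solve for b^0_112 .. b^3_112 (one coordinate).

    b211 and b121 are lists of four scalars indexed by the sector i;
    superscripts are taken modulo 4 as in Figure 2-6.
    """
    q = sum(b211[i] - b121[i] for i in range(4)) / 4.0
    # Right-hand sides of the difference equations in (2-22).
    r = [0.5 * (b211[i] - b121[(i + 1) % 4] - q) for i in range(4)]
    # The averaging row b0 + b1 + b2 + b3 = 4 g* fixes the free constant.
    b0 = g_star + (3.0 * r[0] + 2.0 * r[1] + r[2]) / 4.0
    b1 = b0 - r[0]
    b2 = b1 - r[1]
    b3 = b2 - r[2]
    return [b0, b1, b2, b3], r

if __name__ == "__main__":
    solution, rhs = solve_b112([1.0, 2.5, -0.5, 4.0], [0.5, 1.0, 3.0, -2.0], 0.7)
    print(solution, rhs)
```

Because the perturbed right-hand sides sum to zero, the fourth difference equation of (2–22) holds automatically for the returned values.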
2.4 Smoothness Verification
In this section we formally verify the following lemma. For the purpose of the proof, we view the c-patch in its equivalent representation as four Bezier patches of total degree 4.
Lemma 1. Two adjacent polynomial pieces $a$ and $b$ defined by the rules of Section 2.2 (Figure 2-4, Figure 2-6, (2–3), (2–4)) meet at least
(i) $C^2$ if $a$ and $b$ correspond to two regular quads;
(ii) $C^1$ if $a$ and $b$ are adjacent pieces of a c-patch;
(iii) $C^1$ if $a$ and $b$ correspond to two quads, exactly one of which is regular;
(iv) with tangent continuity if $a$ and $b$ correspond to two different irregular quads.
Proof. (i) If $a$ and $b$ are bicubic patches corresponding to regular quads, they are part of a bicubic spline with uniform knots and therefore meet $C^2$. (ii) If $a$ and $b$ are adjacent pieces of a c-patch, then Equations (2–3) enforce $C^1$ continuity.
For the remaining cases, let $b$ be a triangular piece. Let $u$ be the parameter corresponding to the quad edge between $b_{400} = v^0$, where $u = 0$ and the valence is $N_0$, and $b_{040} = v^1$, where $u = 1$ and the valence is $N_1$ (Figure 2-11 for case (iii) and Figure 2-12 for case (iv)). By construction, the common boundary $b(u, 0) = a(0, u)$ is a curve of degree 3 with Bezier control points $(v^0, t^0_0, t^1_1, v^1)$, so that bicubic patches on regular quads and triangular patches on irregular quads match up exactly.
Denote by $\partial_1 b$ the partial derivative of $b$ along the common boundary and by $\partial_2 b$ the partial derivative in the other variable. Since $b(u, 0) = a(0, u)$, we have $\partial_1 b(u, 0) = \partial_2 a(0, u)$. The partial derivative of $a$ in the other variable is $\partial_1 a$. We will verify that the following conditions, which imply tangent continuity, hold:
if one quad is ordinary (case (iii)),
$$\partial_1 b(u, 0) = 2\partial_2 b(u, 0) + \partial_1 a(0, u); \tag{2–23}$$
if both quads are extraordinary (case (iv)),
$$\bigl((1 - u)\lambda_0 + u\lambda_1\bigr)\,\partial_1 b(u, 0) = \partial_2 b(u, 0) + \partial_1 a(0, u), \tag{2–24}$$
where $\lambda_0 := 1 + c^0$, $\lambda_1 := 1 - c^1$, and $c^i := \cos(\frac{2\pi}{N_i})$.
Both equations, (2–23) and (2–24), equate vector-valued polynomials of degree 3 (we write $\partial_1 b(u, 0)$ in degree-raised form [14]). The equations hold if and only if all Bezier coefficients are equal. Offhand, this means checking four vector-valued equations for each of (2–23) and (2–24). However, in both cases, the setup is symmetric with respect to reversal of the direction in which the boundary $b(u, 0)$ is traversed. That means we need only check the first two equations (2–23') and (2–23'') of (2–23) and the first two equations (2–24') and (2–24'') of (2–24). We verify these equations by inserting the formulas of Figures 2-4 and 2-6.
To verify (2–23), the key observation is that $N_0 = N_1 = 4$ if one quad is ordinary. Hence $c^0 = c^1 = 0$ and $s^0 = s^1 = 1$ (cf. Figure 2-6) and $t^i_j = e^i_j$. Therefore, for example (cf. Figure 2-11),
$$2\partial_2 b(0, 0) = 2 \cdot 4(b_{301} - v^0) = 8 \cdot \tfrac{3}{4}\Bigl(\frac{e^0_0 + e^0_1}{2} - v^0\Bigr) = 3(e^0_0 + e^0_1) - 6v^0,$$
where the factor $\frac{3}{4}$ stems from raising the degree from 3 to 4; and the second Bezier coefficients of $\partial_1 b(u, 0)$ (in degree-raised form) and of $2\partial_2 b(u, 0)$ are respectively (cf. Figure 2-11)
$$3\,\frac{(e^0_0 - v^0) + 2(e^1_1 - e^0_0)}{3} \quad\text{and}\quad 2 \cdot 4(b_{211} - b_{310}) = 8\Bigl(\frac{e^1_1 - e^0_0}{4} + \frac{e^0_0 - v^0}{8} + 3\,\frac{f^0 - e^0_0}{8}\Bigr).$$
Then, comparing the first two Bezier coefficients of $\partial_1 b(u, 0)$ and $2\partial_2 b(u, 0) + \partial_1 a(0, u)$ yields equality and establishes $C^1$ continuity:
$$\underbrace{3(e^0_0 - v^0)}_{\partial_1 b(0,0)} = \underbrace{3(e^0_0 + e^0_1) - 6v^0}_{2\partial_2 b(0,0)} \; \underbrace{-\,3(e^0_1 - v^0)}_{\partial_1 a(0,0)} \tag{$'$}$$
$$(e^0_0 - v^0) + 2(e^1_1 - e^0_0) = 2(e^1_1 - e^0_0) + (e^0_0 - v^0) + 3(f^0 - e^0_0) - 3(f^0 - e^0_0). \tag{$''$}$$
Figure 2-11. $C^1$ transition between a triangular and a bicubic patch.
The equations for (2–24) are similar, except that we need to replace $e_j$ by $t_j$ and keep in mind that, by definition,
$$(t^0_{n_0-1} - v^0) + (t^0_1 - v^0) = 2c^0(t^0_0 - v^0).$$
Figure 2-12. $G^1$ transition between two triangular patches.
Hence, for example,
$$\partial_2 b(0, 0) + \partial_1 a(0, 0) = 4(b_{301} - v^0 + a_{301} - v^0) = \tfrac{3}{4} \cdot 4 \cdot (1 + c^0)(t^0_0 - v^0).$$
The first of the four coefficient equations of (2–24) then simplifies to
$$3(1 + c^0)(t^0_0 - v^0) = 4(b_{301} + a_{301} - 2v^0) = 3\Bigl(\frac{t^0_1 + t^0_0}{2} - v^0 + \frac{t^0_{N_0-1} + t^0_0}{2} - v^0\Bigr) = 3 \cdot \tfrac{1}{2}\bigl(2c^0(t^0_0 - v^0) + 2(t^0_0 - v^0)\bigr). \tag{$'$}$$
Noting that the terms in $(f^0 - e^0_0)/(s^0 + s^1)$ in the expansions of $b_{211}$ and $a_{211}$ cancel, the second coefficient equation is
$$6\lambda_0(t^1_1 - t^0_0) + 3\lambda_1(t^0_0 - v^0) = 12(b_{211} + a_{211} - 2b_{310}) = \frac{12 \cdot 2(1 + c^0)}{4}(t^1_1 - t^0_0) + \frac{12 \cdot 2(1 - c^1)}{8}(t^0_0 - v^0). \tag{$''$}$$
It is easy to read off that the equalities hold. So the claim of smoothness is verified.
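The first coefficient equation of case (iv) can be replayed numerically. The sketch assumes, for illustration only, that the projected tangent points $t^0_j$ form a regular $N_0$-gon around $v^0$ in the tangent plane, which is one configuration satisfying the identity $(t^0_{n_0-1} - v^0) + (t^0_1 - v^0) = 2c^0(t^0_0 - v^0)$; the function name is hypothetical.

```python
import math

def first_coefficient_gap(N0: int, radius: float = 1.0) -> float:
    """Largest componentwise gap between 3(1+c0)(t0 - v) and 4(b301 + a301 - 2v)."""
    c0 = math.cos(2.0 * math.pi / N0)
    v = (0.0, 0.0)
    # Assumed tangent points: a regular N0-gon around v in the tangent plane.
    t = [(radius * math.cos(2.0 * math.pi * j / N0),
          radius * math.sin(2.0 * math.pi * j / N0)) for j in range(N0)]
    # Degree-raised neighbors of v in the two triangular pieces b and a.
    b301 = tuple(v[d] + 0.75 * ((t[1][d] + t[0][d]) / 2.0 - v[d]) for d in range(2))
    a301 = tuple(v[d] + 0.75 * ((t[N0 - 1][d] + t[0][d]) / 2.0 - v[d]) for d in range(2))
    lhs = tuple(3.0 * (1.0 + c0) * (t[0][d] - v[d]) for d in range(2))
    rhs = tuple(4.0 * (b301[d] + a301[d] - 2.0 * v[d]) for d in range(2))
    return max(abs(l - r) for l, r in zip(lhs, rhs))

if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7):
        print(n, first_coefficient_gap(n))
```

The gap should vanish up to round-off for every valence, which is the content of the first coefficient equation.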
2.5 Complexity Analysis
2.5.1 Number of Patches
The conversion scheme yields the minimum set of patches because (1) no initial refinement of the coarse input mesh is needed; (2) each quadrilateral facet of the coarse mesh corresponds to only one patch. That is, the total number of patches equals the number of facets in the mesh. The patch complexity of various schemes is compared in Figure 1-12.
The low cost of construction and evaluation makes c-patches an attractive representation, not just on the GPU.
2.5.2 Cost of Patch Construction
The separation into vertex and patch construction means that the number of scaled vertex additions (adds) per patch is independent of the valence. The cost of computing the control points per patch, i.e., with the cost of vertex computations distributed, is $4 \times (4 + 1 + 1 + 2) = 32$ adds per bicubic construction. Computing $t_j$ from $t_0$ and $t_1$ and determining $b^i_{211}$, $b^i_{121}$ and $b^i_{112}$ according to Figure 2-6 amounts to an additional $4 \times (2 + 6 + 6 + 12) = 104$ adds per c-patch. Each c-patch has 24 coefficients. This compares favorably to, say, [30], where 16+12+12 coefficients are generated.
2.5.3 Cost of Surface Evaluation
The patch can be evaluated at any parameter pair $(u, v)$ of its domain using de Casteljau's algorithm. A tensor-product bicubic Bezier patch is defined by 16 control points. The evaluation at $(u, v)$ needs 42 vector-vector additions, 42 scalar-vector multiplications, and 42 scalar-scalar operations. Similarly, the evaluation of a c-patch at $(u, v)$ requires 40 vector-vector additions and 60 scalar-vector multiplications. In terms of evaluation cost, a c-patch has roughly the same cost as a bicubic patch.
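For the triangular pieces of a c-patch, de Casteljau's algorithm repeatedly averages neighboring coefficients with the three barycentric weights. The sketch below evaluates a total degree 4 piece with weights ordered as in (2–2), $u_1^k u_2^{\ell}(1 - u_1 - u_2)^m$, and compares against the explicit Bernstein sum. It is a plain illustration, not the operation-count-optimized shader evaluation; the dictionary layout keyed by `(k, l, m)` is an assumption.

```python
from math import factorial

def de_casteljau_triangle(coeffs, u1, u2, degree=4):
    """Evaluate a triangular Bezier piece; coeffs maps (k, l, m) to a point."""
    u3 = 1.0 - u1 - u2
    current = dict(coeffs)
    for d in range(degree, 0, -1):
        nxt = {}
        for k in range(d):
            for l in range(d - k):
                m = d - 1 - k - l
                a = current[(k + 1, l, m)]
                b = current[(k, l + 1, m)]
                c = current[(k, l, m + 1)]
                nxt[(k, l, m)] = tuple(u1 * x + u2 * y + u3 * z
                                       for x, y, z in zip(a, b, c))
        current = nxt
    return current[(0, 0, 0)]

def bernstein_triangle(coeffs, u1, u2, degree=4):
    """Reference evaluation by the explicit sum in (2-2)."""
    u3 = 1.0 - u1 - u2
    dim = len(next(iter(coeffs.values())))
    point = [0.0] * dim
    for (k, l, m), p in coeffs.items():
        w = (factorial(degree) / (factorial(k) * factorial(l) * factorial(m))
             * u1**k * u2**l * u3**m)
        for axis in range(dim):
            point[axis] += w * p[axis]
    return tuple(point)

# Arbitrary test coefficients for the 15 indices with k + l + m = 4.
sample = {(k, l, 4 - k - l): (0.3 * k + l, l * l - (4 - k - l), k * (4 - k - l) + 0.5)
          for k in range(5) for l in range(5 - k)}

if __name__ == "__main__":
    print(de_casteljau_triangle(sample, 0.2, 0.3))
    print(bernstein_triangle(sample, 0.2, 0.3))
```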
2.6 Approximation of the Catmull-Clark Subdivision Surface
Since Catmull-Clark subdivision is a standard modeling tool, our scheme is designed to approximate the Catmull-Clark subdivision surface. In fact, the resulting bicubic patches completely agree with the Catmull-Clark subdivision surface except in the immediate neighborhood of irregular mesh vertices. In such a neighborhood, the patches join at least with tangent continuity and interpolate the limit point of the irregular mesh vertex. Furthermore, the center of a c-patch interpolates the center point of the corresponding Catmull-Clark limit surface due to the choice of the c-patch coefficient $b_{112}$.
2.7 Water-Tight Surface Verification
Patches are evaluated independently. If the vertices generated along the boundary by two adjacent patches do not match exactly, the refined mesh will have a hole in it. There are three configurations for adjacent patches: (1) both are bi-3 patches, (2) both are c-patches, (3) one of them is a bi-3 patch and the other a c-patch.
The coefficients defining the shared boundary curve are derived by the averaging rules defined in Figure 2-4. Since additions are commutative, the generation of all boundary coefficients is independent of the choice of patch. In other words, no round-off discrepancy and hence no cracking is possible in the first case. The boundary coefficients of a c-patch are computed by the same rules of Figure 2-4; therefore water-tightness is also achieved in the latter two cases. Note that the computation of the cubic boundaries shared by a bicubic patch and a c-patch is mathematically identical.
2.8 Discussion
The introduction of triangular patches to model quad patches is somewhat unconventional, but has been used in an I3D paper before [15]. Also, [49] is based on triangular patches. Evaluation and normal computation of degree 4 triangular patches are comparable in cost to those of tensor-product bicubic patches: in the triangular case we have to average 15 control points, in the tensor-product case 16. Triangular patches may deserve more attention in OpenGL.
CHAPTER 3
GPU IMPLEMENTATION
3.1 Overview
We implemented the conversion scheme using C++ on the DirectX 10 pipeline. We compute vertex neighborhoods according to Figure 2-4 in the vertex shader and use the geometry shader primitive triangle with adjacency to accumulate the coefficients of the bicubic patch or compute a c-patch according to Figure 2-6. We implemented conversion plus rendering in two variants: a 1-pass and a 2-pass scheme.
3.2 2-pass Approach
Figure 3-1. 2-pass implementation detailed in Figure 3-2. The first pass converts, the second renders. Note that the geometry shader only computes at most 24 coefficients per patch and does not create (amplify to) evaluation point primitives.
Figure 3-2. 2-pass conversion: VS = vertex shader, GS = geometry shader, PS = pixel shader. VS Out of Pass 1 outputs $N$ points $f_j$ for one vertex (hence the subscript) and GS In of Pass 1 retrieves four points $f^i$, each generated by a different vertex of the quad (hence the superscript).
The 2-pass implementation constructs the patches in the first pass using the vertex shader and the geometry shader and evaluates positions and normals in the second pass. Pass 1 streams out only the $4 \times 6$ coefficients of a c-patch and not the $4 \times \binom{4+2}{2}$ Bezier control points of the equivalent triangular pieces. The data amplification necessary for evaluation takes place by instancing a $(u, v)$-grid on the vertex shader in the second pass. That is, we do not stream back large data sets after amplification. Position and normal are computed on the $(u, v)$ domain $[0..1]^2$ of the bicubic patch or of the c-patch (not on any triangular domains). We pre-tessellate the quad domain and store the results in a set of textures of different resolutions. If the tessellation factor is chosen to be $m$, the texture with $(m + 1)$ by $(m + 1)$ parametric values is sent to the vertex shader in the subsequent evaluation pass. Given the pre-tessellated domain with a patch identifier, the vertex shader loads the appropriate control points and evaluates the patch. Figure 3-2 lists the input, output and computations of each pipeline stage. Figure 3-1 illustrates this association of computations and resources. In order to avoid costly branching in HLSL (High Level Shader Language) and to optimize the performance, specialized shaders are written for patch construction and evaluation based on the patch type.
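The pre-tessellated domain is simply a table of $(m + 1) \times (m + 1)$ parameter pairs. The following CPU-side sketch shows one possible layout of such a table before it is uploaded as a texture; the row-major ordering and the function name are assumptions, not a description of the actual texture layout.

```python
def pretessellate(m: int):
    """Return the (m+1) x (m+1) parameter pairs of [0, 1]^2, row by row."""
    if m < 1:
        raise ValueError("tessellation factor must be at least 1")
    return [(i / m, j / m) for j in range(m + 1) for i in range(m + 1)]

if __name__ == "__main__":
    grid = pretessellate(8)
    print(len(grid), grid[0], grid[-1])
```

One table per supported tessellation factor can be built once and reused for every patch, since the parameter values do not depend on the patch.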
3.3 1-pass Approach
In the 1-pass implementation, the evaluation immediately follows conversion in the geometry shader, using the geometry shader's ability to amplify, i.e., to output multiple point primitives for each facet (Figure 3-4). While a 1-pass implementation sounds more efficient than a 2-pass implementation, DX10 limits data amplification in the geometry shader so that the maximal evaluation density is $8 \times 8$ per quad. Moreover, maximal amplification in the geometry shader slows the performance. We observed that the 2-pass implementation performs at least 25% better. Figure 3-3 lists the data flow on the graphics pipeline.
3.4 Coordinate System Transformation
When we evaluate the normal and position of an irregular quad at $(u, v)$, we first need to transform the tessellated domain value from the Cartesian coordinates $(u, v)$ to barycentric coordinates $(s, t, w)$. Figure 3-5 illustrates how to locate the one of the four triangles in which $(u, v)$ lies. In this way, we minimize the number of comparisons and take care of the shared vertices. We let $(0.0, 0.0)$, $(1.0, 0.0)$ and $(0.5, 0.5)$ belong only to $T_1$, $(1.0, 1.0)$ only to $T_2$, and $(0.0, 1.0)$ only to $T_4$.
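The lookup can be sketched as follows. The layout is an assumption: $T_1$ is taken to be the bottom triangle, $T_2$ the right, $T_3$ the top and $T_4$ the left one, which is consistent with the ownership of the shared vertices stated above but is not confirmed by the text. The ordering of the barycentric coordinates and the function names are likewise hypothetical.

```python
# Assumed outer corners (first, second) of each triangle; the third corner
# of every triangle is the center (0.5, 0.5).
CORNERS = {
    1: ((0.0, 0.0), (1.0, 0.0)),
    2: ((1.0, 0.0), (1.0, 1.0)),
    3: ((1.0, 1.0), (0.0, 1.0)),
    4: ((0.0, 1.0), (0.0, 0.0)),
}

def locate_triangle(u: float, v: float) -> int:
    """Return the index 1..4 of the triangle that owns (u, v)."""
    if v <= u and v <= 1.0 - u:
        return 1  # bottom: owns (0,0), (1,0) and the center
    if v <= u:
        return 2  # right: owns (1,1)
    if v <= 1.0 - u:
        return 4  # left: owns (0,1)
    return 3      # top

def to_barycentric(u: float, v: float):
    """Return (triangle index, (s, t, w)) with w the weight of the center."""
    tri = locate_triangle(u, v)
    # Rotate (u, v) so that the owning triangle becomes the bottom one.
    if tri == 2:
        u, v = v, 1.0 - u
    elif tri == 3:
        u, v = 1.0 - u, 1.0 - v
    elif tri == 4:
        u, v = 1.0 - v, u
    s = 1.0 - u - v   # weight of the first outer corner
    t = u - v         # weight of the second outer corner
    w = 2.0 * v       # weight of the center
    return tri, (s, t, w)

if __name__ == "__main__":
    print(to_barycentric(0.9, 0.5))
```

At most three comparisons are needed per point, and every point on a shared diagonal is assigned to exactly one triangle.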
Figure 3-3. 1-pass conversion: VS = vertex shader, GS = geometry shader, PS = pixel shader. GS amplifies the geometry and evaluates the patches.
Figure 3-4. At present, the 1-pass conversion-and-rendering must place patch assembly and evaluation on the geometry shader. This is not efficient.
Figure 3-5. $(u, v)$ on an irregular quad. The unit square with corners $(0.0, 0.0)$, $(1.0, 0.0)$, $(1.0, 1.0)$, $(0.0, 1.0)$ is split at its center $(0.5, 0.5)$ into the four triangles $T_1$, $T_2$, $T_3$, $T_4$.
3.5 Water-Tight Evaluation
The HLSL code in Figure 3-6 shows that the same cubic curve is evaluated along the boundary. An explicit if-statement in the evaluation guarantees the exact same ordering of computations, since boundary coefficients are only computed once.
Figure 3-6. Water-tight evaluation.
3.6 Conclusion
The presented approach fits well into a GPU pipeline. In both approaches, we compute $v$, $e$, $f$ and $t$ in the vertex shader using the vertex neighborhood and the rules in Figure 2-4. Each vertex has $2n + 1$ vertices in its vertex neighborhood, where $n$ is the valence. This information is stored in a texture. With a vertex ID and its valence, all vertices in its neighborhood can be retrieved in counter-clockwise order. In the geometry shader, the patch is finalized and assembled. Overall, the 2-pass implementation has better performance because of small stream-out, short geometry shader code and minimal amplification on the geometry shader.
CHAPTER 4
RESULTS
4.1 Shape Quality
Our algorithm produces $C^1$ surfaces that closely approximate Catmull-Clark subdivision surfaces. We compare our algorithm with [30] with respect to closeness to Catmull-Clark surfaces. We measure how close a surface is to the Catmull-Clark surface by comparing both the geometric difference and the normal angle difference. Figure 4-1 compares the smoothed quad mesh surfaces with densely refined Catmull-Clark subdivision surfaces based on the same mesh. Both geometric distance, as a percentage of the local quad size, and normal distance, in degrees of variation, are compared. Especially after displacement, large models rendered by subdivision and by quad smoothing appear visually indistinguishable. The relatively small examples without displacement shown in Figure 4-1 and the close-up in Figure 4-5 are also important to support our observation that c-patches do not create shape problems compared to a single bicubic patch: despite the lower degree and internal $C^1$ join, their visual appearance is remarkably similar to that of bicubic patches. The comparison with ACC-patches [30] is shown in Figure 4-2. Figures 4-3 and 4-4 show the smooth surface generated by our algorithm and the surface after applying displacement mapping, respectively.
Figure 4-1. Comparison between the Catmull-Clark (CC) subdivision limit surface and the smoothed quad mesh surface for the same input.
Figure 4-2. Comparison of ACC-patch and c-patch in terms of approximation of Catmull-Clark subdivision surfaces for the same input.
Figure 4-3. GPU smoothed quad surfaces: orange patches correspond to ordinary quads, blue patches to extraordinary quads.
Figure 4-4. GPU smoothed quad surfaces with displacement mapping.
4.2 Performance
We compiled and executed the implementation on the latest graphics cards of both major vendors under DirectX 10 and tested the performance for several industry-sized models. Two surface models and models with displacement mapping are shown in Figures 4-3 and 4-4, respectively. Table 4-2 summarizes the performance of the 2-pass algorithm for different granularities of evaluation. The frog model, in particular, provides a challenge due to the large number of extraordinary patches. The Frog Party shown in Figure 4-11 currently renders at 50 fps for uniform evaluation with N = 9, i.e., on a $9 \times 9$ grid. That is, the implementation converts $1292 \times 9$ quads, of which 59% are extraordinary, and renders roughly 1 million polygons 50 times per second. On the same hardware, we measured Bunnell's efficient implementation (distribution accompanying [9]) featuring the single frog model, i.e., 1/9th of the work of the Frog Party, running at 44 fps with three subdivisions (equivalent to tessellation factor N = 9). That is, GPU smoothing of quad meshes is an order of magnitude faster. Compared to [46], the speedup is even more dramatic. While the comparison is not among equals, since both [46] and [9] implement recursive Catmull-Clark subdivision, it is nevertheless fair to observe that the speedup is at least partially due to our avoiding stream back after amplification (data explosion due to refinement). We expect that more careful storage of vertex neighborhoods, in retrieval order, will further improve our use of the texture cache and thereby improve the frames per second (fps) count.

Table 4-1. A total degree 4 patch and a bicubic patch have the same evaluation cost at $(u, v)$ in terms of ALU operations.

Evaluation    c-patch (ALU vector ops)    bicubic patch (ALU vector ops)
position      55                          56
normal        3                           3
other         1                           0
total         59                          59

Table 4-2. Frames per second for some standard test meshes with each patch evaluated on a grid of size $N \times N$; eqs = percentage of extraordinary quads. Sword and Frog are shown in Figure 4-3, Head in Figure 4-1.

Mesh (verts, quads, eqs)     N = 5    N = 9    N = 17    N = 33
Sword (140, 138, 38%)        965      965      965       703
Head (602, 600, 100%)        637      557      376       165
Frog (1308, 1292, 59%)       483      392      226       87

Figure 4-5. Close-up of the frog. The refined mesh is water-tight.

Table 4-3. Performance of the slower 1-pass implementation in frames per second.

Mesh     N = 2    N = 5    N = 8
Sword    389      96       43
Head     108      34       15
Frog     44       10       4
4.3 Displacement Mapping
Displacement mapping is a technique for adding geometric detail to a mesh by means of a height map. It differs from bump mapping or normal mapping in that it changes the geometry by moving vertices, often along their normal directions, according to the value in the height map. Changing the actual geometry, not just the normal as in bump mapping, permits self-occlusion. Figure 4-6 shows displacement mapping on the frog model, which consists of 330k facets. The size of the height map is 1024 by 1024.
Figure 4-6. Displacement mapping on the frog model.
In order to perturb normals after displacement mapping, we need the derivatives $D_u$ and $D_v$ of the displacement (the bump mapping values). The displaced surface is
$$S = P + D\,n, \tag{4–1}$$
where $S$ is the displaced position of the point $P$, $D$ is the displacement and $n$ is the normal at $P$. The new normal is then calculated as the cross product of $S_u$ and $S_v$:
$$S_u = P_u + D_u\,n + D\,n_u \tag{4–2}$$
$$S_v = P_v + D_v\,n + D\,n_v \tag{4–3}$$
Note that $n_u$ and $n_v$ are the derivatives of the normalized normal $n$:
$$n_u = \frac{n'_u - n\,(n'_u \cdot n)}{\|n'\|}, \tag{4–4}$$
where $n' := P_u \times P_v$ is the unnormalized normal and $n'_u = P_{uu} \times P_v + P_u \times P_{uv}$.
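Equations (4–1) through (4–4) translate into a few cross and dot products. The sketch below is a CPU-side illustration with hypothetical function names. It assumes that the norm in (4–4) is the length of the unnormalized normal $P_u \times P_v$, and it uses the analogous expression $P_{uv} \times P_v + P_u \times P_{vv}$ for the $v$-derivative, which the text does not spell out.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for 3-vectors."""
    return tuple(alpha * xi + yi for xi, yi in zip(x, y))

def unit_normal_derivative(raw_deriv, n, raw_length):
    """(n'_u - n (n'_u . n)) / |n'| as in (4-4), with |n'| assumed."""
    proj = dot(raw_deriv, n)
    return tuple((r - proj * ni) / raw_length for r, ni in zip(raw_deriv, n))

def displaced_normal(Pu, Pv, Puu, Puv, Pvv, D, Du, Dv):
    """Unit normal of S = P + D n from first and second derivatives of P."""
    raw = cross(Pu, Pv)
    length = math.sqrt(dot(raw, raw))
    n = tuple(c / length for c in raw)
    raw_u = axpy(1.0, cross(Puu, Pv), cross(Pu, Puv))
    raw_v = axpy(1.0, cross(Puv, Pv), cross(Pu, Pvv))  # assumed analogue for v
    nu = unit_normal_derivative(raw_u, n, length)
    nv = unit_normal_derivative(raw_v, n, length)
    Su = axpy(D, nu, axpy(Du, n, Pu))   # (4-2)
    Sv = axpy(D, nv, axpy(Dv, n, Pv))   # (4-3)
    N = cross(Su, Sv)
    N_len = math.sqrt(dot(N, N))
    return tuple(c / N_len for c in N)

if __name__ == "__main__":
    zero = (0.0, 0.0, 0.0)
    print(displaced_normal((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), zero, zero, zero,
                           0.3, 0.5, 0.0))
```

On a flat patch with a height that increases along $u$, the resulting normal tilts against the $u$ direction, as expected for a ramp.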
4.4 Morphing and Animation
We implement morphing using the 2-pass approach. The animated sequence of input meshes, in the form of textures, is fed into the Input Assembler of the first pass each frame. The morphed patches are constructed during the first pass. Fine details are added in the second pass. The screen shots in Figures 4-9, 4-10 and 4-11 illustrate real-time displacement and animation.
Figure 4-7. Comparison of the c-patch scheme with PN-Triangles (also called N-patch), ACC-patch, and Catmull-Clark subdivision.
Figure 4-8. Comparison of the c-patch scheme with PN-Triangles (also called N-patch), ACC-patches, and Catmull-Clark subdivision.
Figure 4-9. Real-time animation on the Sword model.
Figure 4-10. Real-time animation on the Frog model.
Figure 4-11. Asynchronous animation of nine Frogs.
4.5 Conclusion
Smoothing quad meshes on the GPU offers an alternative to transmitting highly refined facet representations to the GPU and is preferable for interactive graphics and for integration with complex morphing and displacement.
We advocated a 2-pass scheme since, as we argued, the DX10 geometry shader is not well suited for the data amplification needed for evaluation after conversion. The 1-pass scheme outlined in Chapter 3 may become more valuable with the availability of a dedicated hardware tessellator [29, 48]. Such a tessellator will make amplification more efficient and support adaptive tessellation (which is why we only discussed uniform tessellation in Chapter 3). Such hardware amplification will also benefit the 2-pass approach in that the $(u, v)$ domain tessellation fed into the second pass will be replaced by the amplification unit.
CHAPTER 5
PATCH CONVERSIONS FOR MESHES WITH TRI/QUAD/PENT FACETS
Our conversion algorithm can be generalized to work for arbitrary meshes. The generalized
algorithm [34] provides an elegant solution for meshes with tri/quad/pent facets. Removing
restrictions on vertex valences and allowing meshes with triangles, quadrilaterals, and pentagons
vastly simplifies a designer’s task and enriches the design space of meshes for smooth surfaces:
while quads naturally model the flow of (parallel) feature lines and are therefore the main facet
type in models, triangular facets allow merging lines while pentagonal facets allow starting
new lines (Figure 5-1), without creating T-corners or forcing refinement of intermediate models
to satisfy connectivity or quad-layout constraints. Essentially, designers can re-use the whole
range of polyhedral models they are used to. We modified the algorithm for converting quad
meshes into a generalized method for meshes with tri/quad/pent facets. The generalized scheme
converts such a polyhedral model to a surface with an everywhere well-defined normal that is C2 in
‘regular’ mesh regions with quad-grid connectivity. Figure 5-2 shows an example of the resulting
surfaces. Note that the facets are limited to triangles, quads and pentagons due to current GPU
constraints and to avoid unnecessary notational, technical and shape complexity.
Figure 5-1. (a) Retaining the density of feature lines while varying their number. (b),(c) Axe handle detail using a triangle and a pentagon to transition between detailed and coarser areas.
An irregular facet with k sides is converted into a k-patch. A k-patch is a generalization
of a c-patch. It is a piecewise degree-4 C1 spline patch with k cubic boundaries. A k-patch
is defined by 6k + 1 control points, indicated as ◦ in Figure 5-3(b),(c). That is, the k-patch
corresponding to a triangular, quadrilateral or pentagonal facet is defined by a total of 19, 25 or
31 points respectively.
Figure 5-2. The generalized scheme converts a mesh with tri/quad/pent facets to a smooth surface consisting of bi-cubic patches (yellow) and k-patches with k = 3 (green), k = 4 (red), and k = 5 (gray).
Figure 5-3. (a) An ordinary facet is converted to a bi-cubic patch with 16 control points gij. (b),(c) An extraordinary facet with k sides is converted to a k-patch defined by 6k + 1 control points shown as ◦. The k-patch can be viewed as k C1-connected degree-4 triangular patches, indexed by i = 0, . . . , k−1, with cubic outer boundaries.
Figure 5-4. The triangular sectors are listed in counter-clockwise order with a modulo-k superscript. (a) 14 control points from three consecutive sectors of a k-patch define (b) a single patch in triangular Bezier-form.
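For evaluation, we can recover the polynomial representation of the $i$th sector in triangular
Bezier-form of total degree 4 (Figure 5-3(b) and (c)),

\[ S(u, v) := \sum_{i+j+k=4} \mathbf{b}_{ijk} \, \frac{4!}{i!\,j!\,k!} \, u^i v^j (1 - u - v)^k, \tag{5-1} \]

where the $\binom{4+2}{2}$ BB-coefficients $\mathbf{b}_{ijk} \in \mathbb{R}^3$ are indexed as in Figure 5-4. Specifically, we compute
the $\binom{4+2}{2}$ coefficients $\mathbf{b}_{ijk}$ (Figure 5-4(b)) from the 14 coefficients labeled in Figure 5-4(a) by simple
averaging: degree-raising the coefficients $\mathbf{b}^i_{3-\ell,\ell,0}$, $\ell = 0, \ldots, 3$, to $\mathbf{b}^i_{4-\ell,\ell,0}$, $\ell = 0, \ldots, 4$,

\[ \left[ \mathbf{b}^i_{400},\ \mathbf{b}^i_{310},\ \mathbf{b}^i_{220},\ \mathbf{b}^i_{130},\ \mathbf{b}^i_{040} \right] = \left[ \mathbf{b}^i_{300},\ \frac{\mathbf{b}^i_{300} + 3\mathbf{b}^i_{210}}{4},\ \frac{\mathbf{b}^i_{210} + \mathbf{b}^i_{120}}{2},\ \frac{3\mathbf{b}^i_{120} + \mathbf{b}^i_{030}}{4},\ \mathbf{b}^i_{030} \right] \]

and computing the shared BB-coefficients on the sector boundaries, $\mathbf{b}^i_{3-\ell,0,1+\ell} = \mathbf{b}^{i-1}_{0,3-\ell,1+\ell}$,
$\ell = 0, 1, 2, 3$ (i.e., indices 301, 202, 103 and 004 in Figure 5-4(b)), from the C1 constraints.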
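The degree-raising step for one cubic sector boundary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `degree_raise_cubic` and `bezier_point` are hypothetical, control points are plain 3-tuples, and the sketch does not reproduce the GPU implementation of [34].

```python
def degree_raise_cubic(ctrl):
    """Raise a cubic Bezier boundary (4 points) to degree 4 (5 points).

    Hypothetical helper: ctrl holds the cubic boundary coefficients in order.
    """
    if len(ctrl) != 4:
        raise ValueError("expected the 4 control points of a cubic")
    raised = []
    for j in range(5):
        t = j / 4.0
        # new_j = t * old_{j-1} + (1 - t) * old_j; at the two ends the
        # missing neighbor has weight zero, so the end points are copied.
        prev = ctrl[j - 1] if j > 0 else ctrl[0]
        cur = ctrl[j] if j < 4 else ctrl[3]
        raised.append(tuple(t * p + (1.0 - t) * c for p, c in zip(prev, cur)))
    return raised


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve of any degree by de Casteljau's algorithm."""
    pts = [tuple(p) for p in ctrl]
    while len(pts) > 1:
        pts = [tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]


cubic = [(0, 0, 0), (4, 0, 0), (8, 4, 0), (12, 12, 0)]
quartic = degree_raise_cubic(cubic)
```

Degree raising changes only the representation, not the curve, so `bezier_point(cubic, t)` and `bezier_point(quartic, t)` should agree for every `t` up to rounding.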
See [34] for a thorough explanation of the algorithm, its GPU implementation, and its
smoothness verification.
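To make the evaluation formula (5–1) concrete, the following Python sketch evaluates one degree-4 triangular sector from its 15 BB-coefficients. The names `quartic_indices` and `eval_quartic_triangle` and the dictionary layout keyed by `(i, j, k)` are assumptions made for illustration; they are not taken from [34].

```python
from math import factorial


def quartic_indices():
    """All index triples (i, j, k) with i + j + k = 4 (15 of them)."""
    return [(i, j, 4 - i - j) for i in range(5) for j in range(5 - i)]


def eval_quartic_triangle(coeffs, u, v):
    """Evaluate S(u, v) of total degree 4 in triangular Bezier form.

    coeffs maps (i, j, k) to a 3D point (hypothetical layout).
    """
    w = 1.0 - u - v
    point = [0.0, 0.0, 0.0]
    for (i, j, k) in quartic_indices():
        multinomial = factorial(4) // (factorial(i) * factorial(j) * factorial(k))
        basis = multinomial * (u ** i) * (v ** j) * (w ** k)
        for axis in range(3):
            point[axis] += basis * coeffs[(i, j, k)][axis]
    return tuple(point)


# Sanity check via linear precision: coefficients placed at (i/4, j/4, k/4)
# reproduce the barycentric coordinates (u, v, 1 - u - v).
linear = {idx: (idx[0] / 4.0, idx[1] / 4.0, idx[2] / 4.0)
          for idx in quartic_indices()}
sample = eval_quartic_triangle(linear, 0.2, 0.3)
```

With the linear test coefficients above, `sample` should equal (0.2, 0.3, 0.5) up to rounding, which is a quick way to check the indexing and the multinomial weights.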
CHAPTER 6
DISCUSSION AND FUTURE WORK
6.1 Future GPU API
Our conversion scheme not only fits well with the current graphics hardware pipeline,
but also matches the architecture of future graphics hardware [29, 48] very well. The
workload currently in the geometry shader will be assigned to the patch shader. The ideal GPU
pipeline needs to exploit more parallelism in the geometry shader, where the 24 coefficients of a
c-patch can be computed independently given the vertex neighborhood. With maximal parallelism,
the cost of constructing a whole patch is roughly equal to the cost of deriving one coefficient.
Currently we precompute the tessellated domain and store these static values in a set
of textures. In the future, this part of the computation will be replaced by the tessellation unit.
Animation using our conversion scheme will then be achieved in a single pass without geometry
transmission between passes.
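The precomputed tessellated domain mentioned above can be pictured with a small Python sketch. It assumes a uniformly sampled square (u, v) domain, as for a bi-cubic patch. The function name `uniform_domain` and the row-major layout are hypothetical, and the texture packing used in our implementation is not shown.

```python
def uniform_domain(n):
    """Uniformly tessellate the unit square into n x n cells.

    Returns the (u, v) samples in row-major order and the triangle index
    list (two triangles per cell). Layout is an illustrative assumption.
    """
    if n < 1:
        raise ValueError("tessellation factor must be at least 1")
    uv = [(i / n, j / n) for j in range(n + 1) for i in range(n + 1)]
    triangles = []
    for j in range(n):
        for i in range(n):
            a = j * (n + 1) + i      # lower-left corner of the cell
            b = a + 1                # lower-right
            c = a + (n + 1)          # upper-left
            d = c + 1                # upper-right
            triangles.append((a, b, d))
            triangles.append((a, d, c))
    return uv, triangles


uv, triangles = uniform_domain(4)
```

Because these values depend only on the tessellation factor and not on the mesh, they can be computed once and reused for every patch and every frame, which is why a hardware tessellation unit can take over this part.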
6.2 Volume Preservation
Preserving volume under constraints can make the animation of deformable objects more realistic.
The well-known divergence theorem can be used to reduce a volume integral to an
integral over the surface. Given a closed object, the volume is matched to a prescribed value by
inflating or deflating the deformable object uniformly. To enhance realism, this method can
be further extended to fix parts of the object and to attach different material properties to surface
pieces. This exact, localized volume preservation method works for all surfaces that consist
of Bezier patches. Therefore, we will combine this method with our new surface conversion
algorithm to achieve real-time volume preservation.
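The idea of reducing the volume integral to a surface integral can be illustrated on a closed triangle mesh, which serves here as a simplified stand-in for the Bezier patches discussed above. In this Python sketch the names `signed_volume` and `rescale_to_volume` are hypothetical, and uniform scaling about the vertex average is only one possible reading of "inflating or deflating uniformly".

```python
def signed_volume(verts, tris):
    """Volume enclosed by a closed, outward-oriented triangle mesh.

    By the divergence theorem the volume is a sum over faces of
    a . (b x c) / 6, where a, b, c are the face corners.
    """
    total = 0.0
    for ia, ib, ic in tris:
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = verts[ia], verts[ib], verts[ic]
        total += (ax * (by * cz - bz * cy)
                  + ay * (bz * cx - bx * cz)
                  + az * (bx * cy - by * cx))
    return total / 6.0


def rescale_to_volume(verts, tris, target):
    """Scale the mesh about its vertex average so its volume equals target."""
    current = signed_volume(verts, tris)
    if current <= 0.0:
        raise ValueError("mesh must be closed and outward oriented")
    s = (target / current) ** (1.0 / 3.0)  # volume scales with s cubed
    n = len(verts)
    center = tuple(sum(p[d] for p in verts) / n for d in range(3))
    return [tuple(center[d] + s * (p[d] - center[d]) for d in range(3))
            for p in verts]


# A tetrahedron with volume 1/6, faces oriented outward.
tet_verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tet_tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
inflated = rescale_to_volume(tet_verts, tet_tris, 1.0)
```

For polynomial patches the same surface integral can be evaluated exactly in terms of the control points, which is what makes the approach attractive in combination with the conversion algorithm.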
6.3 Adaptive Tessellation
Adaptive tessellation samples each surface patch more densely in regions of high
curvature and less densely in regions of low curvature. Moreover, it adjusts the level of detail
according to how close the geometry is to the camera. The surface is only tested where and when
it is necessary. Therefore, adaptively tessellated surfaces will greatly improve performance. The
tessellation factor can be generated by using the flatness test [9]. With the tessellation unit in the
GPU, tessellating the domain is almost free.
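As a rough illustration of how a flatness measure can drive the tessellation factor, the Python sketch below inspects one cubic Bezier boundary curve. It is a simplified stand-in and not the test described in [9]: the deviation measure (distance of the inner control points from the chord), the square-root rule, and the names `point_to_segment_distance` and `tess_factor` are all assumptions made for this example.

```python
import math


def point_to_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b (3D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(x * x for x in ab)
    t = 0.0 if denom == 0.0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.sqrt(sum((p[i] - closest[i]) ** 2 for i in range(3)))


def tess_factor(ctrl, tolerance, max_factor=64):
    """Pick a tessellation factor for a cubic Bezier edge (heuristic).

    The chord error of n segments shrinks roughly like 1/n^2, so the
    factor grows with the square root of deviation / tolerance.
    """
    p0, p1, p2, p3 = ctrl
    deviation = max(point_to_segment_distance(p1, p0, p3),
                    point_to_segment_distance(p2, p0, p3))
    if deviation <= tolerance:
        return 1  # flat enough: a single segment suffices
    return min(max_factor, math.ceil(math.sqrt(deviation / tolerance)))


flat_edge = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
curved_edge = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0), (3.0, 0.0, 0.0)]
```

To account for camera distance, the tolerance could be scaled with the distance of the edge from the viewer, so that distant patches receive lower factors. That extension is left out of the sketch.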
REFERENCES
[1] Microsoft DirectX10 SDK. 2008. http://www.microsoft.com/downloads/details.aspx?FamilyId=572BE8A6-263A-4424-A7FE-69CFF1A5B180displaylang=en.
[2] C. Bajaj, J. Chen, and G. Xu. Free form surface design with A-patches. In Proceedings of Graphics Interface 94, pages 174–181, Banff, Alberta, Canada, 1994.
[3] S. Bischoff, L. P. Kobbelt, and H. Seidel. Towards hardware implementation of Loop subdivision. In HWWS ’00: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware, pages 41–50, New York, NY, USA, 2000. ACM Press.
[4] D. Blythe. The Direct3D 10 System. In Proceedings of ACM SIGGRAPH 2006, pages 724–734, 2006. http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/Direct3D10web.pdf.
[5] M. Bóo, M. Amor, M. Doggett, J. Hirche, and W. Strasser. Hardware support for adaptive subdivision surface rendering, 2001. citeseer.ist.psu.edu/article/boo01hardware.html.
[6] J. Bolz and P. Schröder. Rapid evaluation of Catmull-Clark subdivision surfaces. In Web3D ’02: Proceeding of the seventh international conference on 3D Web technology, pages 11–17, New York, NY, USA, 2002. ACM Press.
[7] J. Bolz and P. Schröder. Evaluation of subdivision surfaces on programmable graphics hardware. 2007. http://www.multires.caltech.edu/pubs/GPUSubD.pdf.
[8] T. Boubekeur and C. Schlick. Generic mesh refinement on GPU. In HWWS ’05: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, pages 99–104, New York, NY, USA, 2005. ACM.
[9] M. Bunnell. GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation, chapter 7. Adaptive Tessellation of Subdivision Surfaces with Displacement Mapping. Addison-Wesley, Reading, MA, 2005.
[10] E. Catmull and J. Clark. Recursively generated B-spline surfaces on arbitrary topological meshes. Computer Aided Design, 10:350–355, 1978.
[11] R. L. Cook. Shade trees. ACM, New York, NY, USA, 1998.
[12] D. Doo and M. Sabin. Behaviour of recursive division surfaces near extraordinary points. Computer Aided Design, 10:356–360, 1978.
[13] T. DeRose, M. Kass, and T. Truong. Subdivision surfaces in character animation. In SIGGRAPH ’98: Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pages 85–94, New York, NY, USA, 1998. ACM Press.
[14] G. Farin. Curves and surfaces for computer aided geometric design: a practical guide. Academic Press Professional, Inc., San Diego, CA, USA, 1988.
[15] C. Gonzalez and J. Peters. Localized hierarchy surface splines. In S. S. J. Rossignac, editor, ACM Symposium on Interactive 3D Graphics, pages 7–15, 1999.
[16] M. Guthe, A. Balazs, and R. Klein. GPU-based trimming and tessellation of NURBS and T-spline surfaces. ACM Transactions on Graphics, 24(3):1016–1023, 2005.
[17] M. Guthe, A. Balazs, and R. Klein. GPU-based trimming and tessellation of NURBS and T-spline surfaces. ACM Trans. Graph., 24(3):1016–1023, 2005.
[18] M. Halstead, M. Kass, and T. DeRose. Efficient, fair interpolation using Catmull-Clark surfaces. Proceedings of SIGGRAPH 93, pages 35–44, Aug 1993.
[19] H. Hoppe, T. DeRose, T. Duchamp, M. Halstead, H. Jin, J. McDonald, J. Schweitzer, and W. Stuetzle. Piecewise smooth surface reconstruction. Computer Graphics, 28(Annual Conference Series):295–302, 1994.
[20] D. L. James and C. D. Twigg. Skinning mesh animations. In SIGGRAPH ’05: ACM SIGGRAPH 2005 Papers, pages 399–407, New York, NY, USA, 2005. ACM.
[21] K. Karciauskas and J. Peters. Guided subdivision, 2005. http://www.cise.ufl.edu/research/SurfLab/papers.shtml.
[22] O. A. Karpenko and J. F. Hughes. SmoothSketch: 3D free-form shapes from complex sketches. ACM Transactions on Graphics, 25/3:589–598, 2006.
[23] L. Kavan, C. O’Sullivan, and J. Zara. Efficient collision detection for spherical blend skinning. In Proceedings of the 4th international conference on Computer graphics and interactive techniques in Australasia and Southeast Asia, Kuala Lumpur, Malaysia, pages 147–156, 2006.
[24] L. Kavan and J. Zara. Spherical blend skinning: a real-time deformation of articulated models. In I3D ’05: Proceedings of the 2005 symposium on Interactive 3D graphics and games, pages 9–16, New York, NY, USA, 2005. ACM.
[25] A. Krishnamurthy, R. Khardekar, and S. McMains. Direct evaluation of NURBS curves and surfaces on the GPU. In SPM ’07: Proceedings of the 2007 ACM symposium on Solid and physical modeling, pages 329–334, New York, NY, USA, 2007. ACM.
[26] S. Lai and F. F. Cheng. Adaptive rendering of Catmull-Clark subdivision surfaces. In CAD-CG ’05: Proceedings of the Ninth International Conference on Computer Aided Design and Computer Graphics, pages 125–132, Washington, DC, USA, 2005. IEEE Computer Society.
[27] A. Lee, H. Moreton, and H. Hoppe. Displaced subdivision surfaces. In K. Akeley, editor, Siggraph 2000, Computer Graphics Proceedings, pages 85–94. ACM Press / ACM SIGGRAPH / Addison Wesley Longman, 2000. citeseer.ist.psu.edu/lee00displaced.html.
[28] A. Lee, H. Moreton, and H. Hoppe. Displaced subdivision surfaces. In K. Akeley, editor, Siggraph 2000, Computer Graphics Proceedings, Annual Conference Series, pages 85–94. ACM Press / ACM SIGGRAPH / Addison Wesley Longman, 2000.
[29] M. Lee. Next-generation graphics programming on Xbox 360, 2006. http://download.microsoft.com/download/d/3/0/d30d58cd-87a2-41d5-bb53-baf560aa2373/Next GenerationGraphicsProgrammingon Xbox 360.ppt.
[30] C. Loop and S. Schaefer. Approximating Catmull-Clark subdivision surfaces with bicubic patches. Technical report, Microsoft Research, MSR-TR-2007-44, 2007.
[31] C. T. Loop. Smooth subdivision surfaces based on triangles, 1987. Master’s Thesis, Department of Mathematics, University of Utah.
[32] A. Mohr, L. Tokheim, and M. Gleicher. Direct manipulation of interactive character skins. In I3D ’03: Proceedings of the 2003 symposium on Interactive 3D graphics, pages 27–30, New York, NY, USA, 2003. ACM.
[33] K. Muller and S. Havemann. Subdivision surface tesselation on the fly using a versatile mesh data structure, 2000. citeseer.ist.psu.edu/muller00subdivision.html.
[34] A. Myles, T. Ni, and J. Peters. GPU-friendly smooth surfaces from meshes with tri/quad/pent facets. In Symposium on Geometry Processing, July 2 - 4, 2008, Copenhagen, Denmark, pages 1–8. Blackwell, 2008.
[35] A. Myles, Y. Yeo, and J. Peters. GPU conversion of quad meshes to smooth surfaces. In D. Manocha, B. Levy, and H. Suzuki, editors, ACM Solid and Physical Modeling Symposium, June 2 - 4, 2008, Stony Brook University, Stony Brook, New York, USA, pages 321–326. ACM Press, 2008.
[36] A. Nealen, T. Igarashi, O. Sorkine, and M. Alexa. FiberMesh: designing freeform surfaces with 3D curves. ACM Trans. Graph., 26(3), 2007.
[37] T. Ni, Y. Yeo, A. Myles, V. Goel, and J. Peters. GPU smoothing of quad meshes. In M. Spagnuolo, D. Cohen-Or, and X. Gu, editors, IEEE International Conference on Shape Modeling and Applications, June 4 - 6, 2008, Stony Brook University, Stony Brook, New York, USA, pages 3–10. ACM Press, 2008.
[38] J. Peters. Patching Catmull-Clark meshes. In K. Akeley, editor, Siggraph 2000, Computer Graphics Proceedings, Annual Conference Series, pages 255–258. ACM Press / ACM SIGGRAPH / Addison Wesley Longman, 2000.
[39] J. Peters. Geometric continuity. In Handbook of Computer Aided Geometric Design, pages 193–229. Elsevier, 2002.
[40] J. Peters and A. Nasri. Computing volumes of solids enclosed by recursive subdivision surfaces. Computer Graphics Forum, 16(3):C89–C94, 1997.
[41] J. Peters and U. Reif. Analysis of generalized B-spline subdivision algorithms. SIAM Journal on Numerical Analysis, 35(2):728–748, Apr. 1998.
[42] H. Prautzsch. Freeform splines. Computer Aided Geometric Design, 14(3):201–206, 1997.
[43] H. Prautzsch, W. Boehm, and M. Paluszny. Bezier and B-Spline Techniques. Springer Verlag, 2002.
[44] K. Pulli and M. Segal. Fast rendering of subdivision surfaces. In SIGGRAPH ’96: ACM SIGGRAPH 96 Visual Proceedings: The art and interdisciplinary programs of SIGGRAPH ’96, page 144, New York, NY, USA, 1996. ACM.
[45] S. Schaefer and J. Warren. Exact evaluation of non-polynomial subdivision schemes at rational parameter values. In PG ’07: Proceedings of the 15th Pacific Conference on Computer Graphics and Applications, pages 321–330, Washington, DC, USA, 2007. IEEE Computer Society.
[46] L.-J. Shiue, I. Jones, and J. Peters. A realtime GPU subdivision kernel. In M. Gross, editor, Siggraph 2005, Computer Graphics Proceedings, Annual Conference Series, pages 1010–1015. ACM Press / ACM SIGGRAPH / Addison Wesley Longman, 2005.
[47] J. Stam. Exact evaluation of Catmull-Clark subdivision surfaces at arbitrary parameter values. In SIGGRAPH, pages 395–404, 1998.
[48] A. Tatarinov. Instanced tessellation in DirectX10, 2008. http://www.microsoft.com/downloads/details.aspx?FamilyId=572BE8A6-263A-4424-A7FE-69CFF1A5B180displaylang=en.
[49] A. Vlachos, J. Peters, C. Boyd, and J. L. Mitchell. Curved PN triangles. In 2001 Symposium on Interactive 3D Graphics, Bi-Annual Conference Series, pages 159–166. ACM Press, 2001.
[50] D. Zorin. Subdivision for modeling and animation. ACM SIGGRAPH Course Notes, 2000.
BIOGRAPHICAL SKETCH
Tianyun Ni was born in Nanjing, China. She was awarded her BS in computer science with a
mathematics minor from Texas State University in 2000 and her ME in computer engineering
from the University of Florida in 2002. She earned her doctoral degree in the field of computer
graphics in 2008.