Visibility Driven Out-of-Core HLOD Rendering
Patrick Cozzi
The University of Pennsylvania
00000000 of 01010110
Project History
Procedurally generated model of Pompeii: ~1.4 billion polygons. Image from [Mueller06]
00000001 of 01010110
Project History
Boeing 777 model: ~350 million polygons. Image from http://graphics.cs.uni-sb.de/MassiveRT/boeing777.html
00000010 of 01010110
Contents
Previous Work
  View Frustum and Occlusion Culling
  Hardware Occlusion Queries (HOQ)
  Level of Detail (LOD)
  Hierarchical Level of Detail (HLOD)
  Out-of-Core Rendering (OOC)
00000011 of 01010110
Contents Continued
Implementation Work
  Vertex Clustering [Rossignac93]
  HLOD Tree Creation
  Primary Contribution: OOC Rendering
Results
Future Work
Demos throughout
00000100 of 01010110
View Frustum Culling
Can be slower than brute force. When?
[Figure: objects labeled culled or rendered according to the view frustum]
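The per-object test behind the culled/rendered labels can be sketched as a box-versus-plane check. This is a minimal illustration, not the talk's implementation: the `Vec3`, `Plane`, `AABB`, and `overlapsFrustum` names are hypothetical, and the frustum is assumed to be given as six planes whose normals point inward.

```cpp
#include <array>
#include <cassert>

struct Vec3 { double x, y, z; };

// Plane in the form dot(n, p) + d = 0, with n assumed to point into the frustum.
struct Plane { Vec3 n; double d; };

struct AABB { Vec3 min, max; };

// Returns false only when the box lies entirely behind at least one plane.
// The test is conservative: a box near a frustum corner may be reported as
// overlapping even though it is outside.
bool overlapsFrustum(const AABB& box, const std::array<Plane, 6>& planes)
{
    for (const Plane& p : planes)
    {
        // Pick the box corner that lies farthest along the plane normal.
        const Vec3 v{ p.n.x >= 0 ? box.max.x : box.min.x,
                      p.n.y >= 0 ? box.max.y : box.min.y,
                      p.n.z >= 0 ? box.max.z : box.min.z };

        // If even that corner is behind the plane, the whole box is outside.
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0)
            return false;
    }
    return true;
}
```

Running this test per object costs CPU time, which is one way culling can end up slower than brute force when nearly everything is visible; testing bounding volumes in a hierarchy reduces the number of tests.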
00000101 of 01010110
View Frustum Culling
[Figure: nodes 0 through 5 shown in the scene and in a hierarchy]
00000110 of 01010110
View Frustum Culling
Demo
00001000 of 01010110
Occlusion Culling
Effective in scenes with high depth complexity
culled
00001001 of 01010110
Occlusion Culling
From-region or from-point
Most are conservative
Occluder Fusion
Difficult for general scenes with arbitrary occluders. So make simplifying assumptions:
  [Wonka00] – urban environments
  [Ohlarik08] – planets and satellites
00001010 of 01010110
Hardware Occlusion Queries
From-point visibility that handles general scenes with arbitrary occluders and occluder fusion
How? Use the GPU
00001011 of 01010110
Hardware Occlusion Queries
Disable color and depth write
Color Buffer Depth Buffer
00001100 of 01010110
Hardware Occlusion Queries
Disable color and depth write Render BV using HOQ
00001101 of 01010110
Hardware Occlusion Queries
Disable color and depth write Render BV using HOQ Enable color and depth writes
Color Buffer Depth Buffer
00001110 of 01010110
Hardware Occlusion Queries
Disable color and depth writes
Render BV using HOQ
Enable color and depth writes
Render object based on HOQ results
00001111 of 01010110
Hardware Occlusion Queries
class IQueryOcclusion
{
public:
    virtual void Begin() = 0;
    virtual void End() = 0;
    virtual bool IsResultAvailable() = 0;
    virtual unsigned int NumberOfSamplesPassed() = 0;
    virtual unsigned int NumberOfFragmentsPassed() = 0;
};
00010000 of 01010110
Hardware Occlusion Queries
CPU stalls and GPU starvation
[Figure: CPU and GPU timelines. Top: Draw o1, Draw o2, Draw o3 on both the CPU and the GPU. Bottom: Query o1 followed by Draw o1, with a CPU stall and GPU starvation in between.]
00010001 of 01010110
Is Culling Enough?
00010010 of 01010110
Is Culling Enough?
Now what?
00010011 of 01010110
Is Culling Enough?
Demo
00010100 of 01010110
Level of Detail
Generation: fewer triangles, simpler shaders
Selection: distance, pixel size
Switching: avoid popping
Discrete, Continuous, Hierarchical
00010101 of 01010110
Discrete LOD
3,086 Triangles 52,375 Triangles 69,541 Triangles
00010110 of 01010110
Discrete LOD
Demo
00010111 of 01010110
Discrete LOD
Not enough detail up close
Too much detail in the distance
00011000 of 01010110
Continuous LOD
edge collapse
vertex split
Image from [Luebke01]
00011001 of 01010110
Hierarchical LOD
1 Node: 3,086 Triangles
4 Nodes: 9,421 Triangles
16 Nodes: 77,097 Triangles
00011010 of 01010110
Hierarchical LOD
Node Refinement
visit(node)
{
    if (computeSSE(node) < pixel tolerance)
    {
        render(node);
    }
    else
    {
        foreach (child in node.children)
            visit(child);
    }
}
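The slides do not say how computeSSE is evaluated. A common formulation projects a node's object-space geometric error to pixels using the viewer distance and the vertical field of view; the sketch below assumes that formulation. The `Node` fields, the parameter names, and the rule that a leaf is always rendered are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Node
{
    double geometricError;      // object-space error of this node's simplification
    double distance;            // distance from the viewer (assumed precomputed)
    std::vector<Node> children;
};

// Projects an object-space error to a screen-space error in pixels.
// fovY is the vertical field of view in radians.
double computeSSE(double geometricError, double distance,
                  double viewportHeight, double fovY)
{
    assert(distance > 0.0);
    return (geometricError * viewportHeight) /
           (2.0 * distance * std::tan(fovY / 2.0));
}

// Collects the nodes that would be rendered for a given pixel tolerance.
void visit(const Node& node, double pixelTolerance, double viewportHeight,
           double fovY, std::vector<const Node*>& rendered)
{
    const double sse =
        computeSSE(node.geometricError, node.distance, viewportHeight, fovY);

    // Assumption: a leaf has nothing to refine to, so it is rendered as-is.
    if (node.children.empty() || sse < pixelTolerance)
    {
        rendered.push_back(&node);
        return;
    }
    for (const Node& child : node.children)
        visit(child, pixelTolerance, viewportHeight, fovY, rendered);
}
```

Raising the pixel tolerance stops refinement higher in the tree, trading detail for fewer triangles.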
00011100 of 01010110
Hierarchical LOD
00011101 of 01010110
Hierarchical LOD
New Problem: Cracks
00011110 of 01010110
Hierarchical LOD
Demo
00011111 of 01010110
HLOD + Culling
visit(node)
{
    if (node overlaps view frustum)
    {
        // ...
    }
}
00100000 of 01010110
HLOD + Culling
visit(node)
{
    if (node overlaps view frustum)
    {
        render node’s BV with HOQ
        if (query.NumberOfFragmentsPassed() > 0)
        {
            // ...
        }
    }
}
Render front to back!
00100001 of 01010110
HLOD + Culling + VMSSE
visit(node)
{
    if (node overlaps view frustum)
    {
        render node’s BV with HOQ
        if (query.NumberOfFragmentsPassed() > 0)
        {
            if (computeVMSSE(node, query) < tolerance)
            {
                render(node);
            }
            else
            {
                // ...
            }
        }
    }
}
00100010 of 01010110
VMSSE
VMSSE: Virtual Multiresolution SSE
Relative Visibility = # pixels visible / # possible pixels visible
VMSSE = f(SSE, Relative Visibility)
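As a small worked sketch of the two definitions: relative visibility is a ratio of pixel counts, and VMSSE combines it with the SSE. The slides leave f unspecified, so the product used in `computeVMSSE` below is only an illustrative placeholder, and both function names are hypothetical.

```cpp
#include <cassert>

// Fraction of the pixels a bounding volume could cover that were actually visible.
double relativeVisibility(unsigned int pixelsVisible,
                          unsigned int possiblePixelsVisible)
{
    assert(pixelsVisible <= possiblePixelsVisible);
    if (possiblePixelsVisible == 0)
        return 0.0; // nothing could be visible, so treat the node as fully hidden
    return static_cast<double>(pixelsVisible) / possiblePixelsVisible;
}

// ASSUMPTION: f(SSE, Relative Visibility) is modeled as a simple product here.
// The actual f is not given in the slides.
double computeVMSSE(double sse, double relVisibility)
{
    return sse * relVisibility;
}
```

With this placeholder, a mostly occluded node gets a small VMSSE and stops refining earlier than plain SSE would allow.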
00100011 of 01010110
Optimized HLOD Refinement Driven by HOQs [Charalambos07]
Exploit spatial and temporal coherence for scheduling HOQs.
Predict refinement based on node’s relative visibility from previous frame
VMSSE_i^{est} = SSE_i * bias_{i-1}
00100100 of 01010110
Optimized HLOD Refinement Driven by HOQs [Charalambos07]
Example prediction
  Refinement stopped for this node in previous frame
  VMSSE_i^{est} < threshold ? Stop : Refine
  Stop:
    Issue query
    Render without checking query
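The prediction step can be sketched as follows, assuming each node stores the bias computed from its query result in the previous frame. The `NodeHistory`, `estimateVMSSE`, and `predict` names are hypothetical; how the bias itself is derived is left to [Charalambos07].

```cpp
#include <cassert>

enum class Action { StopAndRender, Refine };

// Per-node state carried over from the previous frame (frame i-1).
struct NodeHistory
{
    double bias;
};

// Estimates this frame's VMSSE without waiting for this frame's query result.
double estimateVMSSE(double sse, const NodeHistory& previous)
{
    return sse * previous.bias;
}

// Decides whether to stop at this node or refine, based on the estimate.
// When stopping, the caller issues the query for use in the next frame and
// renders the node without checking the query result.
Action predict(double sse, const NodeHistory& previous, double threshold)
{
    return estimateVMSSE(sse, previous) < threshold ? Action::StopAndRender
                                                    : Action::Refine;
}
```

Because the decision uses only last frame's data, the CPU does not have to wait on the GPU for the current query.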
00100101 of 01010110
Implementation Work
3 HLOD algorithms including [Charalambos07]
Vertex Clustering
HLOD Tree Creation
OOC Rendering
  Load/Unload Rules
  Rendering
  Replacement Policy
  Multithreading
00100110 of 01010110
Vertex Clustering [Rossignac93]
Fast: expected O(n)
Robustness: arbitrary topology
Capable of drastic simplification
“Easy to code”
OOC extensions [Lindstrom00]
00100111 of 01010110
Vertex Clustering [Rossignac93]
1. Compute per-vertex weights
   [Figure: example vertex weights 1, 1, 0.8, 0.5, 0.5]
2. Assign vertices to clusters
3. Identify highest weighted vertex in each cluster
00100111 of 01010110
Vertex Clustering [Rossignac93]
1. Compute per-vertex weights
2. Assign vertices to clusters
3. Identify highest weighted vertex in each cluster
4. Collapse and remove degenerate triangles
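The four steps above can be sketched on a uniform grid. This is a minimal illustration rather than the implementation from the talk: the weights of step 1 are assumed to be precomputed and stored per vertex, the `Mesh` layout and `cellKey` packing are hypothetical, and the key assumes fewer than 2^21 cells per axis.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vertex { double x, y, z, weight; };
using Triangle = std::array<std::size_t, 3>;
struct Mesh { std::vector<Vertex> vertices; std::vector<Triangle> triangles; };

// Packs the grid cell of a vertex into one key (21 bits per axis).
std::uint64_t cellKey(const Vertex& v, double cellSize)
{
    auto q = [cellSize](double c) {
        const auto cell = static_cast<std::int64_t>(std::floor(c / cellSize));
        return static_cast<std::uint64_t>(cell) & 0x1FFFFF;
    };
    return (q(v.x) << 42) | (q(v.y) << 21) | q(v.z);
}

Mesh clusterVertices(const Mesh& in, double cellSize)
{
    assert(cellSize > 0.0);

    // Steps 2 and 3: assign each vertex to a cluster and remember the
    // highest weighted vertex seen so far in that cluster.
    std::vector<std::uint64_t> keyOf(in.vertices.size());
    std::unordered_map<std::uint64_t, std::size_t> best;
    for (std::size_t i = 0; i < in.vertices.size(); ++i)
    {
        keyOf[i] = cellKey(in.vertices[i], cellSize);
        auto it = best.find(keyOf[i]);
        if (it == best.end() || in.vertices[i].weight > in.vertices[it->second].weight)
            best[keyOf[i]] = i;
    }

    // Step 4: collapse every vertex to its cluster's representative and drop
    // triangles that have two or more corners in the same cluster.
    Mesh out;
    std::unordered_map<std::uint64_t, std::size_t> outIndex;
    for (const Triangle& t : in.triangles)
    {
        const std::uint64_t k[3] = { keyOf[t[0]], keyOf[t[1]], keyOf[t[2]] };
        if (k[0] == k[1] || k[1] == k[2] || k[0] == k[2])
            continue; // degenerate after collapse

        Triangle collapsed{};
        for (int c = 0; c < 3; ++c)
        {
            auto it = outIndex.find(k[c]);
            if (it == outIndex.end())
            {
                it = outIndex.emplace(k[c], out.vertices.size()).first;
                out.vertices.push_back(in.vertices[best.at(k[c])]);
            }
            collapsed[c] = it->second;
        }
        out.triangles.push_back(collapsed);
    }
    return out;
}
```

A larger cell size merges more vertices per cluster, so it gives a more drastic simplification. The single pass over vertices and triangles with hashed cluster lookups matches the expected O(n) cost.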
00101000 of 01010110
Vertex Clustering [Rossignac93]
3,086 Triangles 52,375 Triangles 69,541 Triangles
00101001 of 01010110
Vertex Clustering [Rossignac93]
Questionable Fidelity
Hard to control output
Conservative Error Metric
00101010 of 01010110
HLOD Tree Creation
Input
  Model (.ply, .obj)
  Target triangles per leaf node
  Maximum tree depth
Output
  1 file per node
  Normals computed at runtime
00101011 of 01010110
HLOD Tree Creation
Top-down
Root node:
  Full AABB
  Lowest Detail
00101100 of 01010110
HLOD Tree Creation
Splitting Planes
2 Planes 3 Planes
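One way to read the two cases: splitting a node's AABB with two planes yields 4 children, and with three planes yields 8. The sketch below assumes axis-aligned planes placed at the center of the box, which the slides do not state; the `AABB` layout and `split` function are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct AABB
{
    double min[3];
    double max[3];
};

// Splits a box at its center along each axis flagged in splitAxis.
// Two flagged axes give 4 children; three give 8.
std::vector<AABB> split(const AABB& box, const bool splitAxis[3])
{
    std::vector<AABB> result{ box };
    for (int axis = 0; axis < 3; ++axis)
    {
        if (!splitAxis[axis])
            continue;

        const double mid = 0.5 * (box.min[axis] + box.max[axis]);
        std::vector<AABB> next;
        for (const AABB& b : result)
        {
            AABB low = b;
            AABB high = b;
            low.max[axis] = mid;   // half below the splitting plane
            high.min[axis] = mid;  // half above the splitting plane
            next.push_back(low);
            next.push_back(high);
        }
        result = next;
    }
    return result;
}
```

Choosing two planes for flat regions avoids creating thin, nearly empty children along the short axis.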
00101101 of 01010110
HLOD Tree Creation
Splitting Planes
00101110 of 01010110
HLOD Tree Creation
00101111 of 01010110
Previous Work: Out-of-Core
Based on [Ulrich02]
visit(node)
{
    if ((computeSSE(node) < pixel tolerance) || (not all children resident))
    {
        render(node);
        foreach (child in node.children)
            requestResidency(child);
    }
    else
    {
        foreach (child in node.children)
            visit(child);
    }
}
Prefetch
Need all children
  To render
  To refine
00110000 of 01010110
Previous Work: Out-of-Core
[Varadhan02]
  Requires full skeleton in memory
  No occlusion culling
  No front-to-back sorting
Image From [Varadhan02]
00110001 of 01010110
Previous Work: Out-of-Core
[Corrêa03]
  PLP in separate thread
  Requires full skeleton in memory
  No LOD
00110010 of 01010110
Out-of-Core
Replacement Policy?
  LRU? Can’t refine when one child is removed
  Remove deepest child in parent’s tree?
00110011 of 01010110
OOC Rendering
Benefits of our algorithm
  No full HLOD skeleton
  Works with HOQs
  Refinement with a subset of children
  Replacement policy maximizes detail near the viewer
  Multithreaded
00110100 of 01010110
OOC Rendering: Load/Unload
HLOD tree on disk
00110101 of 01010110
OOC Rendering: Load/Unload
Subset of HLOD tree in memory
00110110 of 01010110
OOC Rendering: Load/Unload
Load node -> load children skeletons
00110111 of 01010110
OOC Rendering: Load/Unload
Only unload dynamic leaves
00111000 of 01010110
OOC Rendering: Load/Unload
Only unload dynamic leaves
00111001 of 01010110
OOC Rendering: Load/Unload
Nodes don’t need all their children in memory
00111010 of 01010110
OOC Rendering: Load/Unload
Result: If a node is not a skeleton, none of its ancestors are skeletons. In other words, if a node has geometry loaded, so do all of its ancestors.
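The unload rule and the resulting invariant can be sketched together. The `Node` layout is hypothetical, and "dynamic leaf" is taken here to mean a node with geometry loaded whose children are all skeletons, which is an interpretation of the slides rather than a stated definition.

```cpp
#include <cassert>
#include <vector>

struct Node
{
    Node* parent = nullptr;
    std::vector<Node*> children;
    bool hasGeometry = false; // false means the node is only a skeleton
};

// Assumed meaning of "dynamic leaf": geometry is loaded for this node
// but for none of its children.
bool isDynamicLeaf(const Node& n)
{
    if (!n.hasGeometry)
        return false;
    for (const Node* c : n.children)
        if (c->hasGeometry)
            return false;
    return true;
}

// Checks the invariant for one node: if it has geometry, so does every ancestor.
bool ancestorsLoaded(const Node& n)
{
    if (!n.hasGeometry)
        return true;
    for (const Node* a = n.parent; a != nullptr; a = a->parent)
        if (!a->hasGeometry)
            return false;
    return true;
}

// Unloads geometry only from dynamic leaves, so no loaded descendant
// is ever left beneath a skeleton.
bool tryUnload(Node& n)
{
    if (!isDynamicLeaf(n))
        return false;
    n.hasGeometry = false;
    return true;
}
```

Restricting unloads to dynamic leaves means memory is reclaimed from the bottom of the resident tree upward, so a coarser ancestor is always available to render.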
00111011 of 01010110
OOC Rendering: Load/Unload
Never Happens:
00111100 of 01010110
OOC Rendering: Rendering
Modify in-core HLOD
Add request queue:
  Stop refinement at skeleton node
  Push node onto request queue
  Ensure parent safety
  Render subset of parent’s geometry
00111101 of 01010110
OOC Rendering: Subset of Parent
Use OpenGL clipping planes
00111110 of 01010110
OOC Rendering: Subset of Parent
Without clipping planes
00111111 of 01010110
OOC Rendering: Subset of Parent
Demo
01000000 of 01010110
OOC Rendering: Node Replacement
Replacement List (only dynamic leaves)
01000001 of 01010110
OOC Rendering: Node Replacement
Replacement List Partitions
01000010 of 01010110
OOC Rendering: Node Replacement
Start Frame
01000011 of 01010110
OOC Rendering: Node Replacement
Add Node
01000100 of 01010110
OOC Rendering: Node Replacement
Render Node
01000101 of 01010110
OOC Rendering: Node Replacement
Move to safety
01000110 of 01010110
OOC Rendering: Node Replacement
Suggest Removal Node
01000111 of 01010110
OOC Rendering: Multithreading
01001000 of 01010110
Low Memory
Demo
01001000 of 01010110
Selected Results (lol)
Load Time: 10 Blocks in Pompeii (5,646,041 triangles)

              Time in seconds
Full model    5.2
Out-of-Core   0.05
01001010 of 01010110
Selected Results
View 1 View 2
Zoomed out rendering
01001011 of 01010110
Selected Results
Zoomed out rendering

                         View 1                          View 2
Brute Force              63 fps, 5,646,041 triangles     63 fps, 5,646,041 triangles
HLOD - SSE               1,415 fps, 161,742 triangles    881 fps, 302,337 triangles
HLOD - Naive VMSSE       1,060 fps, 140,458 triangles    300 fps, 260,007 triangles
HLOD - Scheduled VMSSE   1,176 fps, 140,458 triangles    588 fps, 270,774 triangles
01001100 of 01010110
Selected Results
Zoomed In Rendering
View 3 View 4
01001101 of 01010110
Selected Results
Zoomed In Rendering

                         View 3                          View 4
Brute Force              62 fps, 5,646,041 triangles     62 fps, 5,646,041 triangles
HLOD - SSE               128 fps, 2,541,434 triangles    98 fps, 3,222,701 triangles
HLOD - Naive VMSSE       180 fps, 346,901 triangles      320 fps, 46,765 triangles
HLOD - Scheduled VMSSE   210 fps, 601,730 triangles      232 fps, 103,844 triangles
01001111 of 01010110
Statistics
Lines of Code
  GUI: 420
  Unit Tests: 1,720
  HLOD Creation: 4,600
  Rendering: 4,500
Time Spent Coding: 8 weeks “fulltime.” 3 last spring, 5 this fall. Plus reading, writing, slides, and logistics.
01010000 of 01010110
Future Work
Improve tree creation
  Polygonal simplification
  Splitting planes
Fill cracks
Optimal disk layout
Better occlusion performance
  Multiple volumes or occlusion-preserving low LOD
Optimize use of clipping planes
01010001 of 01010110
Future Work
Don’t require ancestors to have geometry loaded.
  Much better use of memory
  More complicated rendering
  More rendering artifacts
01010010 of 01010110
Future Work
Cache Management
  Aggressively remove nodes
  Replacement Policy: Average detail instead of best up close
01010011 of 01010110
Future Work
Multithreading
  Multiple load threads
    Fault tolerance, increase throughput
  Compute thread(s)
    Compute normals
    Decompress (/ recompress)
    Vertex cache optimize?
01010100 of 01010110
Future Work
True Usefulness
  Textures
  Picking on individual objects
  Test with truly massive models
01010101 of 01010110
Future Work
Today: Mad Mex Happy Hour. Now – 6:30pm
Saturday, February 7th: Graduation Party. My House. 3pm.
01010110 of 01010110