interactive ray tracing #2 peter djeu april 22, 2003
![Page 1: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/1.jpg)
Interactive Ray Tracing #2
Peter Djeu
April 22, 2003
![Page 2: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/2.jpg)
Interactive Ray Tracing
S. Parker, W. Martin, P. Sloan, P. Shirley, B. Smits, C. Hansen
The University of Utah
![Page 3: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/3.jpg)
Goals
• Implement a brute force interactive ray tracer in software for the SGI Origin 2000
– hardware renderers are inflexible, whereas software renderers can be extended and re-tested with new algorithms
• Study and try to compensate for the problems that come with interactive ray tracing (e.g. lighting, shadows, splines)
![Page 4: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/4.jpg)
Why is Ray Tracing Appealing?
1. It scales well (keep throwing processors at the problem until the renderer is interactive)
2. Rendering time is sub-linear in the number of primitives in the scene (unlike rasterization, which is linear)
3. Ray tracing is more flexible and the image quality is better (ex: more primitives are allowed, ray tracing generates shadows, highlights, transparency)
![Page 5: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/5.jpg)
Overview of the System (page 1)
![Page 6: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/6.jpg)
Overview of the System (page 2)
![Page 7: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/7.jpg)
Rendering Mode 1: Conventional Mode
• Create a static set of ray bundles (called jobs) that span a range of sizes. Jobs are handed out first-come first-served, largest first, so that the small jobs at the end balance the load across processors.
• When all bundles have been processed, display the current frame.
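The scheme above can be sketched as a shared job list consumed through a single counter. This is a minimal sketch, not the authors' implementation: the sizing rule in `make_jobs`, the `min_size` granularity, and the `shade` callback are hypothetical, and a Python lock stands in for a hardware atomic counter.

```python
import threading

def make_jobs(num_pixels, num_workers, min_size=32):
    """Split [0, num_pixels) into jobs whose sizes shrink toward the end."""
    jobs = []
    start = 0
    while start < num_pixels:
        remaining = num_pixels - start
        # Hypothetical rule: take a fraction of what is left, rounded down
        # to a multiple of min_size, but never less than min_size.
        size = max(min_size, (remaining // (2 * num_workers)) // min_size * min_size)
        size = min(size, remaining)
        jobs.append((start, start + size))
        start += size
    return jobs

def render_frame(num_pixels, num_workers, shade):
    """Render one frame; workers grab jobs first-come first-served."""
    jobs = make_jobs(num_pixels, num_workers)
    frame = [None] * num_pixels
    next_job = 0
    lock = threading.Lock()  # stand-in for an atomic fetch-and-increment

    def worker():
        nonlocal next_job
        while True:
            with lock:
                index = next_job
                next_job += 1
            if index >= len(jobs):
                return
            lo, hi = jobs[index]
            for pixel in range(lo, hi):
                frame[pixel] = shade(pixel)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All bundles are done, so the frame can be displayed.
    return frame
```

Because the large jobs go out first, a worker that finishes early only ever picks up a small job near the end, which keeps the idle time before the frame is displayed short.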
![Page 8: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/8.jpg)
Conventional Mode: Diagram
![Page 9: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/9.jpg)
The Nitty Gritty of Conventional Mode
• Use job sizes that are multiples of 128 bytes. This is because the machine has 128 byte cache lines (no false sharing).
• Use the Origin’s fetch and op instruction as a fast synchronization tool
– 61 μsec on Origin vs. 6 msec on Irix, a big difference
![Page 10: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/10.jpg)
Thoughts on Conventional Mode
• Good: The algorithm is very simple, but still achieves load balancing.
• Bad: How many bundles should be created? How large should the largest bundles be? Other limits to a static algorithm…
• Bad: How are rays grouped into bundles? Do the bundles respect locality (like in Pharr’s paper)?
![Page 11: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/11.jpg)
Rendering Mode 2: Frameless
• Assign a set of pixels to each processor.
• Each processor will compute its set of pixels as fast as it can, but the screen will be updated at an independent rate.
• We get guaranteed framerate at the cost of inconsistent image quality.
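A small sequential simulation may make the tradeoff concrete. It is only a sketch under assumptions: the scattered (shuffled) pixel assignment, the `speeds` list, and the `shade(pixel, tick)` callback are all hypothetical and chosen for illustration.

```python
import random

def assign_pixels(num_pixels, num_procs, seed=0):
    """Give each processor a fixed, scattered subset of the pixels."""
    order = list(range(num_pixels))
    random.Random(seed).shuffle(order)
    return [order[p::num_procs] for p in range(num_procs)]

def simulate(num_pixels, speeds, ticks, shade):
    """speeds[p] is how many pixels processor p finishes per display refresh."""
    sets = assign_pixels(num_pixels, len(speeds))
    cursor = [0] * len(speeds)
    screen = [None] * num_pixels
    for tick in range(ticks):
        for p, speed in enumerate(speeds):
            for _ in range(speed):
                # Each processor cycles through its own pixels forever.
                pixel = sets[p][cursor[p] % len(sets[p])]
                screen[pixel] = shade(pixel, tick)
                cursor[p] += 1
        # The display refreshes on its own schedule and shows whatever
        # mix of new and stale pixels is on the screen at that moment.
        yield list(screen)
```

With `shade` returning the tick number, each displayed frame shows how old every pixel is: the refresh rate is fixed, while the pixels owned by a slow processor lag behind the rest.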
![Page 12: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/12.jpg)
Frameless Rendering: Diagram
![Page 13: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/13.jpg)
Thoughts on Frameless Mode
• The two competing goals for locality are interesting: when assigning pixels, strong locality means better cache utilization, but strong locality also means that an entire portion of the screen may have a noticeable artifact if that particular processor is overburdened.
– How do we find a balance?
– Or does such a tradeoff completely kill frameless rendering?
• Where are the pictures of frameless mode?
![Page 14: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/14.jpg)
The Transition to Interactive
• A static ray tracer can use a variety of hacks which do not apply to interactive ray tracers (ex: lighting and object’s material hacked based on viewpoint)
• The authors propose some new ray tracing techniques to improve: lighting, material models, shadows, intersection computations.
![Page 15: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/15.jpg)
Whitted Lighting and a Material Model with Categories
• A robust lighting model with coefficients that determine the category and visual appearance of materials. For efficiency, coefficients that are not needed are set to 0:
– Diffuse: no highlights nor specular, just diffuse
– Metal: no diffuse, just specular and highlights
– Dielectric: (ex: glass, water), formula used for specular coefficients
– Polished: complex formula used for overall color
![Page 16: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/16.jpg)
Ambient Lighting
• Problem: Ambient light is usually hacked in to a ray tracer so that points not directly in light are lit up. However, if they face away from the light, these regions appear flat (no real ray tracing occurs).
• Solution: assign a color to “fully facing the light” and a color to “fully facing away from the light.” For all surfaces, find an interpolated color. No need for additional light rays.
![Page 17: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/17.jpg)
Directionally Varying Ambient Lighting
• Problem: Ambient light is usually hacked in to a ray tracer so that points not directly in light are lit up. However, if they face away from the light, these regions appear flat (no real ray tracing occurs).
• Solution: assign a color to “fully facing the light” and a color to “fully facing away from the light.” For all surfaces, find an interpolated color. No need for additional light rays.
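The interpolation described in the solution can be sketched as follows. The slide does not give the exact blend, so the linear blend on the cosine between the surface normal and the light direction is an assumption, as are the function and parameter names.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def ambient_color(normal, to_light, facing_color, away_color):
    """Blend two ambient colors by how much the surface faces the light."""
    n = normalize(normal)
    l = normalize(to_light)
    cos_theta = sum(a * b for a, b in zip(n, l))
    # Assumed blend weight: 1 when fully facing the light, 0 when fully
    # facing away, linear in the cosine in between.
    t = 0.5 * (1.0 + cos_theta)
    return tuple(a + t * (f - a) for f, a in zip(facing_color, away_color))
```

Only a dot product and a blend are evaluated per shading point, which is why no additional light rays are needed.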
![Page 18: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/18.jpg)
Directionally Varying Ambient Lighting in Action
![Page 19: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/19.jpg)
Inner / Outer Object Shadows to Approximate Area Lights
• Problem: Realistic shadows have an umbra and a penumbra created by area lights, not hard shadows
• Solution: treat an area light as a point. Based on the size / shape of the light, construct an inner and outer object for the shadow caster. Create two shadow regions, and interpolate the transparency between them to simulate the soft shadow.
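As a rough illustration of the inner / outer idea, the sketch below handles a single spherical shadow caster. Several things are assumed rather than taken from the paper: the occluder is a sphere, the inner and outer radii are given directly instead of being derived from the light's size and shape, and the transition between them is linear.

```python
import math

def light_fraction(point, light, center, r_inner, r_outer):
    """Return 0.0 in the umbra, 1.0 when fully lit, and a blend in between."""
    seg = [l - p for l, p in zip(light, point)]
    seg_len2 = sum(s * s for s in seg)
    to_center = [c - p for c, p in zip(center, point)]
    # Parameter of the point on the shadow ray closest to the sphere center,
    # clamped so that only the stretch between surface and light counts.
    t = sum(a * b for a, b in zip(to_center, seg)) / seg_len2
    t = max(0.0, min(1.0, t))
    closest = [p + t * s for p, s in zip(point, seg)]
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(closest, center)))
    if d <= r_inner:
        return 0.0  # the ray hits the inner object: full shadow
    if d >= r_outer:
        return 1.0  # the ray misses the outer object: fully lit
    # Between the two objects: assumed linear transparency ramp.
    return (d - r_inner) / (r_outer - r_inner)
```

The light is still sampled as a single point, so the cost stays at one shadow ray per light while the shadow edge gains a soft falloff.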
![Page 20: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/20.jpg)
Diagram of Inner / Outer Objects
![Page 21: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/21.jpg)
Picture of Soft Shadows in Practice
![Page 22: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/22.jpg)
Subdividing Spline Surfaces
• Problem: Usually, splines are tessellated in ray tracers, which means there can be an explosion in memory usage and / or pipeline saturation
• Solution: use a bottom up technique of creating bounding volumes, then compute intersections using Broyden’s method. This is fast (~ 3 iterations / query) and is memory efficient.
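For readers unfamiliar with Broyden's method, here is a minimal two-variable sketch. It is not the paper's ray/spline intersection code: the function `f` stands in for whatever system the intersection test produces in the surface parameters (u, v), and starting from an identity Jacobian estimate is an assumption made to keep the example short.

```python
def solve2(J, b):
    """Solve the 2x2 linear system J s = b by Cramer's rule."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(b[0] * J[1][1] - J[0][1] * b[1]) / det,
            (J[0][0] * b[1] - b[0] * J[1][0]) / det]

def broyden_2d(f, x0, tol=1e-10, max_iter=50):
    """Find x with f(x) close to (0, 0); returns (x, iterations used)."""
    x = list(x0)
    fx = f(x)
    J = [[1.0, 0.0], [0.0, 1.0]]  # assumed initial Jacobian estimate
    for iteration in range(max_iter):
        if max(abs(fx[0]), abs(fx[1])) < tol:
            return x, iteration
        step = solve2(J, [-fx[0], -fx[1]])
        x_new = [x[0] + step[0], x[1] + step[1]]
        f_new = f(x_new)
        df = [f_new[0] - fx[0], f_new[1] - fx[1]]
        # Rank-one update: J += (df - J*step) * step^T / (step . step).
        # No derivatives of f are ever evaluated.
        Js = [J[0][0] * step[0] + J[0][1] * step[1],
              J[1][0] * step[0] + J[1][1] * step[1]]
        denom = step[0] ** 2 + step[1] ** 2
        for i in range(2):
            for j in range(2):
                J[i][j] += (df[i] - Js[i]) * step[j] / denom
        x, fx = x_new, f_new
    raise RuntimeError("Broyden iteration did not converge")
```

The appeal for ray tracing is that each iteration needs only one new evaluation of `f` and a small matrix update, and a good starting guess (for example from a tight bounding volume) keeps the iteration count low.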
![Page 23: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/23.jpg)
Results: page 1
![Page 24: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/24.jpg)
Results: page 2
![Page 25: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/25.jpg)
Results: page 3
• Rendering Room scene (small scene): 9.4 Mb/s
• Rendering the Female Dataset (large scene): 2.1 Mb/s to 8.4 Mb/s, note this is less than before
• Most scenes fit within 4Mb of secondary cache
• Dynamic (aka moving) objects: no acceleration, they are processed using the standard algorithm
• Depth complexity has little effect on rendering speed.
![Page 26: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/26.jpg)
Critique of Results (page 1)
• Why does one chart go up to 64 proc.’s, while the other has a max of 128 proc.’s?
• How was the ideal performance calculated? Note that the ideal line is NOT the same on both charts: (64, 7x) vs. (64, 10x)
• Why is there a drop-off in both charts?
– Any ideas?
![Page 27: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/27.jpg)
Critique of Results (page 2)
• No attempt was made to explain why a larger scene “ironically” uses less memory bandwidth than a smaller scene.
– Coherence? Occlusion? Something else?
• Will most scenes in the future fit within the secondary cache (4 Mb in this case)?
• The authors mainly address the making of an interactive ray tracer for static scenes. The results presented seem more like an afterthought.
![Page 28: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/28.jpg)
Conclusions
• There is still much work to be done in the world of ray tracing, including:
– anti-aliasing, dynamic scenes, performance guarantees, API creation, hardware
• Creating (and using) better ray tracers means that we will be better able to focus our efforts for future work
– usefulness of soft shadows, BRDF’s, reflect’s
![Page 29: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/29.jpg)
State of the Art in Interactive Ray Tracing
I. Wald and P. Slusallek
Saarland University, Germany
![Page 30: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/30.jpg)
Goals
• Create a survey of contemporary raytracing. Topics include:
– the weaknesses of rasterization
– different ways to ray trace
– ray tracing on different platforms (supercomputers, PC’s, PC clusters)
– recent research
• Talk about their research
![Page 31: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/31.jpg)
Problems with Rasterization
• With respect to the number of polygons in the scene, the complexity is O(n) rather than O(log n)
• Hard to scale to parallel architectures because of high communication needs
• Hard to incorporate a shader into the pipeline
![Page 32: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/32.jpg)
Another look at Rasterization
• O(n) rather than O(log n)
– since when has O(n) been a problem in terms of scalability? Of course, O(log n) is better, but…
• Hard to scale to parallel architectures
• Hard to incorporate a shader
– is this still true when we have shader languages such as Cg and MS Cg?
![Page 33: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/33.jpg)
Benefits of Ray Tracing
• Flexible
– different types of rays, different primitives
• O(log n)
• Shading only done on visible components
• Shaders easier to add (no pipeline) (?)
• Correct reflections, refractions
• Parallel and Scalable
• Coherent when using a Pharr-like algorithm
![Page 34: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/34.jpg)
Raytracing is faster, but….
• Almost all tests currently use just primary rays.
• Shadows and reflections will drop frame rate by a constant factor.
• Acceleration structures like BSP trees are the source for the speed, but they are heavily dependent on static scenes. They do not (currently) support dynamic objects.
![Page 35: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/35.jpg)
Different Forms of Ray Tracing
1. Rasterization-Based – do a quick rasterization pass, and then add ray traced effects (artifacts)
2. Image-Based – kind of like frameless rendering from Parker’s paper (artifacts)
3. Approximation-Based – sample certain regions and interpolate (artifacts)
4. Acceleration-Based – construct fast-culling data structures, exploit coherence, respect the memory hierarchy (no artifacts?)
![Page 36: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/36.jpg)
Approximate Ray Tracing
• Main idea: the visual feedback from interactivity (i.e. frame rate) can often be more important than visual correctness
– ex: Sonic 2 and Blast Processing
• Examples:
– Rasterize, then use corrective textures for highlights
– the RenderCache reuses rays within error bound (however, this is great for off-line global illumination)
– the Holodeck keeps all generated rays on disk, reuses
![Page 37: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/37.jpg)
Perceptually Guided Corrective Texturing
![Page 38: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/38.jpg)
Ray Tracing Platform 1: Supercomputers
• Using a 96-proc. SGI PowerChallenge, Muuss was able to ray trace a scene that could not be rasterized (1995)
• Parker et al. used an SGI Origin 2000 to create a ray tracer that could support triangle and non-triangle scenes (1999)
![Page 39: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/39.jpg)
Ray Tracing Platform 2: Desktop PC’s
• Why?
– Supercomputers are rare, while PC’s are everywhere
– Work for stand-alone PC’s could lead to efficient ray tracers on cluster PC’s
• Challenges of using a CPU
– reduce branches and complexity, respect the memory hierarchy, reduce memory bandwidth
![Page 40: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/40.jpg)
Points to Note on the Desktop PC Implementation
• Shading takes up far less than 10% of the total rendering time
• SIMD CPU instructions (aka vector ops) produce only a 2x speedup
– Do 4 rays on one tri., not 4 tri.’s on 1 ray
• Wald’s implementation was compared to freely available POV-Ray and Rayshade
– 11x – 15x speedup
![Page 41: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/41.jpg)
Table 2 (less is more)
![Page 42: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/42.jpg)
Ray Tracing vs. Rasterization (on Desktop PC’s)
• We can already achieve the crossover point (see bottom row of the next slide)
• SGI Performer (a rasterizer) running on powerful desktops is comparable to ray tracing on a more modest desktop
![Page 43: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/43.jpg)
Table 3 (bigger is better)
![Page 44: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/44.jpg)
Conclusions from Desktop PC Results
• Raytracing has a high startup cost per ray, but…
• It scales well as scenes get more complex; a crossover point exists regardless of screen resolution
• You can do correct reflections, etc.
![Page 45: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/45.jpg)
Figure 8
![Page 46: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/46.jpg)
Ray Tracing Platform 3: Clusters of PC’s
• Because ray tracing is embarrassingly parallel, let’s try to build a cheap PC cluster-based ray tracer
• Challenges:
– no shared memory on PC clusters
• Setup: a scheduling machine, a display machine, and lots of processing machines
![Page 47: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/47.jpg)
Data Management in the Cluster World
• An NFS based data fetch system blocks on a data miss, and this is too costly
• Instead, the scene cache is managed in software via an asynchronous loader thread; a ray is suspended until its data arrives
• Compression-Decompression is used for voxels transferred over the network
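The non-blocking fetch can be sketched with a loader thread and two queues. This is an illustration under assumptions, not the cluster system's code: the `VoxelCache` class, its `request` method, and the `fetch` callback are hypothetical names, there is no eviction, and compression is left out.

```python
import queue
import threading

class VoxelCache:
    """Software-managed cache that never blocks the rendering thread."""

    def __init__(self, fetch):
        self._fetch = fetch             # slow call, e.g. a network read
        self._data = {}                 # voxel id -> loaded voxel data
        self._waiting = {}              # voxel id -> rays suspended on it
        self._requests = queue.Queue()  # voxel ids the loader should fetch
        self._lock = threading.Lock()
        self.ready = queue.Queue()      # (ray, voxel data) pairs to resume
        threading.Thread(target=self._loader, daemon=True).start()

    def request(self, ray, voxel_id):
        """Return the voxel if cached; otherwise suspend the ray, return None."""
        with self._lock:
            if voxel_id in self._data:
                return self._data[voxel_id]
            first = voxel_id not in self._waiting
            self._waiting.setdefault(voxel_id, []).append(ray)
        if first:
            self._requests.put(voxel_id)  # only one fetch per missing voxel
        return None

    def _loader(self):
        """Runs in the background: fetch voxels and wake up suspended rays."""
        while True:
            voxel_id = self._requests.get()
            data = self._fetch(voxel_id)
            with self._lock:
                self._data[voxel_id] = data
                rays = self._waiting.pop(voxel_id, [])
            for ray in rays:
                self.ready.put((ray, data))
```

A renderer built on this would keep tracing other rays when `request` returns `None` and drain `ready` whenever it has spare time, so a data miss costs latency for one ray rather than stalling the whole machine.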
![Page 48: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/48.jpg)
Other Issues in the Cluster World
• Preprocess - Create an adaptive BSP tree with small voxels in detailed regions and large voxels in sparse regions, O(n log n)
• Load balancing – assign voxels to machines that have already done them, only good for small scenes
• Interconnect – Gigabit ethernet and switch
– is this fair?
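The adaptive preprocessing step can be illustrated with a toy axis-aligned BSP build. The sketch makes simplifying assumptions that are not from the paper: it stores points instead of triangles, splits each voxel at its spatial midpoint while cycling through the axes, and stops at a hypothetical `max_points` threshold.

```python
def build(points, lo, hi, max_points=4, depth=0, max_depth=20):
    """Build a nested-dict tree; lo and hi are tuples bounding the voxel."""
    if len(points) <= max_points or depth >= max_depth:
        return {"lo": lo, "hi": hi, "points": points}  # leaf voxel
    axis = depth % len(lo)
    mid = 0.5 * (lo[axis] + hi[axis])
    left = [p for p in points if p[axis] < mid]
    right = [p for p in points if p[axis] >= mid]
    # Child bounds: the same box, cut at mid along the split axis.
    left_hi = hi[:axis] + (mid,) + hi[axis + 1:]
    right_lo = lo[:axis] + (mid,) + lo[axis + 1:]
    return {"axis": axis, "mid": mid,
            "left": build(left, lo, left_hi, max_points, depth + 1, max_depth),
            "right": build(right, right_lo, hi, max_points, depth + 1, max_depth)}

def leaves(node):
    """Yield every leaf voxel of the tree."""
    if "points" in node:
        yield node
    else:
        yield from leaves(node["left"])
        yield from leaves(node["right"])
```

Because subdivision stops as soon as a voxel holds few enough primitives, sparse regions end up as a handful of large voxels while detailed regions are cut into many small ones.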
![Page 49: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/49.jpg)
Results for the Cluster Ray Tracer
• On a 12.5 million tri. power plant model, 3-5 fps almost constantly (8-10 fps with SIMD instructions), comparable to rasterizer
• Adding reflective rays: the performance hit is proportional to # of traced rays, reduced coherence -> little effect on performance (!)
• Stress test of a 4x Power Plant (50 million tri’s) found that indoor scenes were not affected (2 extra BSP tree levels), while outside scenes with motion suffered from large voxel transfer over network
![Page 50: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/50.jpg)
Network Saturation vs. Scalability
![Page 51: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/51.jpg)
Hardware Support in the Future
• RAYA – simulations say build a ray tracer on a single chip
• Smart Memories – a programmable and configurable architecture, should be able to get 50 fps at 512 x 512 (!)
• Saarland’s own architecture – a ray tracing pipeline, coherence is enforced by having ray traversal and intersection run on a clock
![Page 52: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/52.jpg)
Smart Memories
![Page 53: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/53.jpg)
Saarland’s Pipeline
![Page 54: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/54.jpg)
Ongoing Ray Tracing Research
• Dynamic scenes
– some work done, Reinhard proposes making large objects live in coarser levels of the hierarchy to maintain constant update cost
• Ray tracing API
– try to make it like OpenGL, like C and Java
• Interactive Global Illumination
– more like an application of ray tracing
![Page 55: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/55.jpg)
Reinhard on dynamic hierarchies
![Page 56: Interactive Ray Tracing #2 Peter Djeu April 22, 2003](https://reader035.vdocuments.mx/reader035/viewer/2022062717/56649e205503460f94b0c2ca/html5/thumbnails/56.jpg)
Parting Thoughts
• Ray tracing and rasterization are, in a way, converging
– Occlusion culling, hierarchical z-buffer, advanced shading
• Still different in that ray tracing selects only the geometry needed, while rasterization needs to conservatively send all tri’s that might be visible
• “We strongly believe that what we see today is only the beginning of an exciting new field of computer graphics.”