
Extreme Multi-Resolution Visualization:

A Challenge on Many Levels

Joanna Balme, Eric Brown-Dymkoski, Victor Guerrero, Stephen Jones, Andre Kessler, Adam Lichtl, Kevin Lung, William Moses, Ken Museth, Nathan Roberson, Danny Taller,

and Tom Fogal (NVIDIA)

SpaceX

August 7, 2015

1 Introduction and Motivation

Accurate simulation of turbulent flows is a persistent problem, marked by the cost of directly resolving chaotic interactions between widely disparate scales. Energies interact between the largest bulk phenomena – the full aircraft, thrust chamber, or device of interest – down to the minute lengths where dissipative forces dominate. Applied cases easily encompass length and time scales ranging over six orders of magnitude for industrial combustors. Bulk statistical quantities can often suffice for design and analysis of some applications, such as aerodynamics; however, in rocket engines, mixing on the smallest scales drives combustion and is crucial for capturing the large-scale dynamics which characterize stability and performance. In these cases, current computational approaches still fall short for predictive analysis: the difficulties in simultaneously capturing the large and small scales, and their interactions, significantly degrade capability and limit the utility of simulation in engineering design cycles.

Multiresolution wavelet analysis is an emerging method for dynamically adaptive simulation. Especially applicable to turbulent flows, it has the ability to efficiently capture wide scale ranges at high accuracy by using multidimensional signal compression to identify intermittent coherent structures. A many-core algorithm was implemented targeting graphics processing units (GPUs) in order to take full advantage of developments in single instruction, multiple thread (SIMT) parallelism. In particular, GPUs offer high memory bandwidth that can be exploited by programs with well-coalesced access. Complementary to this, an efficient algorithm was developed for mapping the implicit wavelet topologies for visualization. This system serves both polygonal representation approaches and a common infrastructure to interface with movie-quality rendering and animation packages.

2 Adaptive Wavelet Simulation

Adaptive wavelet collocation is a general finite-difference framework for solving PDEs on an optimal, dynamic grid [2]. It permits high-order, non-dissipative finite differencing and high-order interpolation. Resolution levels are defined by a series of dyadic nested grids, with lower levels being the coarser. An interpolating wavelet transform identifies information-carrying grid nodes at each level, such that by identifying only these significant nodes and discarding the remainder, the system maintains a sparse data representation as it evolves through time.
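
As a minimal, one-dimensional sketch of the thresholding idea (not the production solver; the linear prediction, the threshold name eps, and the dense level array are assumptions for brevity), the detail coefficient at each odd-indexed node is its value minus a prediction from the coarser, even-indexed nodes, and only nodes whose detail exceeds the threshold are retained:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: one level of a 1D interpolating (lifting) wavelet transform.
// Even-indexed samples form the coarser level; odd-indexed samples are
// predicted by linear interpolation of their even neighbours. The detail
// coefficient is the prediction error; nodes whose detail exceeds eps
// are flagged as significant and kept in the adaptive grid.
std::vector<bool> significantNodes(const std::vector<double>& level, double eps)
{
    std::vector<bool> keep(level.size(), false);
    for (std::size_t i = 1; i + 1 < level.size(); i += 2) {
        const double prediction = 0.5 * (level[i - 1] + level[i + 1]);
        const double detail = level[i] - prediction;   // wavelet coefficient
        keep[i] = std::abs(detail) > eps;              // significant: retain
    }
    return keep;
}
```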

Figure 1 shows how a mesh adapts around a shear-flow vortex, in this case using 14 resolution levels to capture the finest structures. The resulting grid consists only of nodes that are required to reconstruct the global flow field at an error level which can be specified a priori. For homogeneous turbulent regions, data compression greater than 95-99% can be achieved while resolving the very small, intermittent eddies and motions with less than 1% error [3].

Figure 1: Periodic shear flow simulation (Re = 80,000, 14 levels of resolution), colored by fluid density and showing wavelet mesh refinement.

These grids pose several challenges for high-performance computing: the data layout evolves continuously throughout the course of a simulation, and the complicated topology of orthogonal, multi-resolution stencils frustrates coalesced memory access. While the existence of stencil support is guaranteed by the wavelet transform, locating it in memory is non-trivial due to the adaptive, multi-level construction.

Taking advantage of the fact that memory access patterns are not random, this algorithm avoids the need for auxiliary topological structures. Instead, it uses a series of permutations to optimize the grid for streaming access. For each step, the subset of compute nodes for a given level is extracted from the full grid along with all nodes from coarser resolutions. Because stencil locations are well defined by integer strides on this subset, sorting along each dimensional direction allows the memory to be traversed in a parallel-friendly fashion, with threads operating on consecutive memory addresses. The GPU implementation employs scalable, O(N) parallel partitioning and radix sorts for efficient data rearrangement.
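
The permutation step can be sketched as follows (an illustration only; node layout and names are assumptions, and the production code uses GPU radix sorts rather than std::sort):

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Sketch: a node of the level subset (compute nodes of one level plus all
// coarser nodes), identified by its integer grid coordinates.
struct Node {
    std::int32_t i, j, k;   // grid indices at a common resolution
    float value;
};

// Permute the subset so nodes are ordered by (k, j, i). After this sort,
// stencil neighbours along the x direction sit at well-defined integer
// strides, so consecutive threads touch consecutive memory addresses.
void sortForXPass(std::vector<Node>& subset)
{
    std::sort(subset.begin(), subset.end(), [](const Node& a, const Node& b) {
        return std::tie(a.k, a.j, a.i) < std::tie(b.k, b.j, b.i);
    });
}
```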

3 Sparse Dataset Visualization

Sparse data representations are also crucial for efficient and high-fidelity visualization at extreme levels of detail. Unfortunately, the wavelet grid is ill-suited for this purpose. Lack of explicit connectivity information, while a major computational advantage for simulation, complicates all but trivial point rendering techniques. Many visualization approaches require the topological information of the mesh to be known, since it facilitates efficient interpolation kernels for ray-integration and volume rendering. Sparse octrees, with fully populated leaf nodes, are a common representation that naturally defines this connectivity.

Although they have been derived independently and for differing purposes, octrees and wavelet grids are intrinsically aligned. Both feature a level-by-level dyadic construction that terminates at a well-defined root, and our algorithm exploits this topological similarity to efficiently construct a forest of octrees from the simulation-optimized wavelet grid.

Initially, the process builds a VDB tree [1] from the adaptive wavelet structure. While OpenVDB offers constant-time (O(1)) threaded random-access iterators for sparsely sampled domains, we primarily use this VDB tree as an intermediate acceleration structure to index and query point-neighborhood information. As wavelet transform stencils ensure that internal branch nodes in the octree exist, top-down tree building is efficient and allows for early termination of searches. Nodes added in order to fully populate octree leaves have their accompanying data filled in using tri-linear interpolation; though low order, this is fast and ensures boundedness of the simulation data. The wavelet transform and finite-difference stencil support present a topology such that few nodes need to be added, minimizing artifacts and maintaining the integrity of the visualization.
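
A minimal sketch of this indexing-and-interpolation step using the public OpenVDB API is given below (the coordinates and values are placeholders, and the real construction walks the wavelet levels top-down rather than inserting a single node):

```cpp
#include <openvdb/openvdb.h>
#include <openvdb/tools/Interpolation.h>

int main()
{
    openvdb::initialize();

    // Intermediate VDB grid indexing the wavelet nodes: each significant
    // node deposits its value at its integer grid coordinate.
    openvdb::FloatGrid::Ptr grid = openvdb::FloatGrid::create(/*background=*/0.0f);
    openvdb::FloatGrid::Accessor acc = grid->getAccessor();
    acc.setValue(openvdb::Coord(8, 4, 2), 1.0f);   // placeholder node

    // Nodes added to fully populate an octree leaf get their data from
    // tri-linear interpolation of nearby samples (BoxSampler), which is
    // fast and keeps values bounded by the surrounding data.
    openvdb::tools::GridSampler<openvdb::FloatGrid, openvdb::tools::BoxSampler>
        sampler(*grid);
    const float filled = sampler.isSample(openvdb::Vec3R(8.5, 4.5, 2.5)); // index space
    (void)filled;
    return 0;
}
```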


Figure 2: Vortices develop at the interface between counter-rotating flows, simulated on a multi-resolution grid using 17 levels of detail. Retaining the sparse structure during rendering maintains an effective compression ratio of 500x and allows each timestep to be converted and rendered in less than one second.

The resulting forest of octrees, once constructed, enables two separate visualization approaches: polygonalization for rasterizing, or volumetric rendering and ray-tracing.

Figure 2 shows a multi-resolution rendering of polygonalized data from a simulation spanning 17 levels of detail, where the smallest feature is just 15 microns within an overall domain spanning 2 meters. Direct rendering of the original point cloud takes more than 60 seconds per frame using one common visualization tool; by comparison, the VDB-based sparse-data algorithm developed here both generates topology and renders the scene in less than one second.

Volumetric rendering and ray-tracing may be performed either in-place, or by populating a second sparse VDB representation. This is distinct from the VDB tree that is used previously in tree construction: it is a self-compressed structure, employing a controlled-fidelity approach similar to wavelets, and populating interpolated voxel information from the octrees. This derived representation is natively supported by, and allows direct interfacing with, most major movie-quality production and animation packages, including Houdini™, RenderMan™, Arnold™, and NVIDIA OptiX™.
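
The hand-off to those packages can be as simple as writing the derived grid to a .vdb file, sketched here with the standard OpenVDB I/O calls (the grid name, file name, and a prior call to openvdb::initialize() are assumptions):

```cpp
#include <openvdb/openvdb.h>
#include <openvdb/io/File.h>

// Sketch: export one timestep of the derived, self-compressed grid to a
// .vdb file that Houdini, RenderMan, Arnold, or OptiX-based tools can read.
// Assumes openvdb::initialize() has already been called.
void exportTimestep(openvdb::FloatGrid::Ptr density)
{
    density->setName("density");          // placeholder grid name

    openvdb::GridPtrVec grids;
    grids.push_back(density);

    openvdb::io::File file("timestep_0000.vdb");   // placeholder file name
    file.write(grids);
    file.close();
}
```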

4 Professional & Production-Quality Visualization

Along with the benefits of leveraging the technology, taking a movie-industry approach to rendering reintroduces a palpable element often lost in the more common scientific visualization techniques. It does not replace such post-processing, but rather is complementary, offering an intuitive view. It enables a very human and tangible approach that is more prevalent in experimental physics.

The visualization shown in Figure 3 was created using NVIDIA OptiX. OptiX is a ray-tracing framework for CPUs and GPUs that greatly simplifies the process of writing custom ray tracers and shaders. OptiX uses bounding volume hierarchies internally and groups similar rays automatically, allowing the developer to focus on using ray tracing to describe their scene and interact with their data. It is an excellent fit for scientific visualization, as one can define the rendering in the best way to communicate the data's features. This can use the physical properties of light, or modified forms, such as virtual Schlieren imaging.

Figure 3: Ray-traced image from NVIDIA OptiX™, using refraction to visualize wake formation around a capsule during late re-entry (Mach = 0.33).

The OptiX renderer is similar to ray-guided renderers introduced in recent years [5, 6, 7]. It uses ray marching, which is considerably faster than the marching cubes alternative, to identify an isosurface of interest. The data from this simulation was converted into VDB format as previously outlined. Since OpenVDB was designed for a CPU, some modifications were made to use the computational power of a GPU to achieve real-time rendering with advanced ray-tracing effects. This involved expanding VDB's internal tile and brick sizes by a factor of 64 to improve GPU access times. The data structure conversion takes about 2 seconds per timestep for a 120-megabyte simulation file.
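
Because OpenVDB's node dimensions are compile-time template parameters, a wider configuration can be expressed along these lines (a sketch only; the exact branching factors of the GPU port are not given here beyond the factor-of-64 growth, so the numbers below are assumptions):

```cpp
#include <openvdb/openvdb.h>

// Sketch: the default FloatGrid uses a Tree4<float, 5, 4, 3> tree, giving
// 8^3 = 512 voxels per leaf "brick". Raising the leaf log2-dimension from
// 3 to 5 yields 32^3 = 32,768 voxels per leaf, i.e. 64x larger bricks,
// which reduces tree-traversal overhead for wide GPU thread blocks.
using WideLeafTree = openvdb::tree::Tree4<float, 5, 4, 5>::Type;
using WideLeafGrid = openvdb::Grid<WideLeafTree>;

WideLeafGrid::Ptr makeWideGrid()
{
    return WideLeafGrid::create(/*background=*/0.0f);
}
```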

OptiX has allowed experimentation with more advanced rendering techniques. Domain scientists are interested in validating the multi-resolution structure of their grid, but the sharp discontinuities can be jarring in communicative visualization. The visualization of Figure 3 uses refraction and reflection to detail fluid structure as a glassy material. This de-emphasizes the 'blocky' discontinuities that can be highlighted with typical Phong shading, although it tends to drown out data at fine scales. Naively viewing the data strictly as an isosurface hides the 'core' of the shockwave and the structure that occurs inside the surface itself. The solution is to do restricted ray marching at the surface intersection point, accumulating color using a standard volume rendering approach. The volume rendering is blended with the color information from the obscuring isosurface computation. This unique method of rendering leads to an interesting focus+context style of visualization, with the volume rendering emphasizing fine-scale features while the semi-transparent isosurface gives large-scale context information.
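
That blending step can be summarized schematically as follows (greatly simplified; the field sampler, transfer function, and blend weight are placeholders, and the real renderer runs inside OptiX device programs):

```cpp
#include <algorithm>

// Schematic of the focus+context blend: after the primary ray hits the
// isosurface, a short, restricted ray march continues past the hit point,
// compositing a volume-rendered colour front-to-back, which is then blended
// with the shaded isosurface colour.
struct Rgba { float r, g, b, a; };
struct Vec3 { float x, y, z; };

// Placeholder helpers, not a real API.
Vec3  advance(Vec3 p, Vec3 d, float t) { return {p.x + t * d.x, p.y + t * d.y, p.z + t * d.z}; }
float sample(const Vec3&)              { return 0.5f; }            // field value
Rgba  transfer(float v)                { return {v, v, v, 0.05f}; } // transfer function

Rgba shadeHit(const Vec3& hit, const Vec3& dir, const Rgba& surfaceColor)
{
    Rgba acc{0, 0, 0, 0};
    const int   steps = 64;       // restricted march: short and bounded
    const float dt    = 0.01f;
    for (int s = 0; s < steps && acc.a < 0.99f; ++s) {
        const Rgba  c = transfer(sample(advance(hit, dir, s * dt)));
        const float w = (1.0f - acc.a) * c.a;   // front-to-back compositing
        acc.r += w * c.r;  acc.g += w * c.g;  acc.b += w * c.b;  acc.a += w;
    }
    // Blend with the semi-transparent isosurface so the surface gives
    // large-scale context while the march reveals fine-scale structure.
    const float k = 0.5f;   // assumed blend weight
    return Rgba{ k * surfaceColor.r + (1 - k) * acc.r,
                 k * surfaceColor.g + (1 - k) * acc.g,
                 k * surfaceColor.b + (1 - k) * acc.b,
                 std::max(surfaceColor.a, acc.a) };
}
```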

5 Conclusion

Predictive simulations at the scale and complexity required by rocket combustor physics have prompted the development of high-performance, adaptive methods. This approach exploits the similarity of the simulation and visualization sparse data representations to maintain a compressed form throughout the CFD pipeline.

References

[1] Museth, K. VDB: High-Resolution Sparse Volumes with Dynamic Topology, ACM Transactions on Graphics, 32(3), 2013. Presented at SIGGRAPH 2013.

[2] Vasilyev, O.V., Bowman, C. Second Generation Wavelet Collocation Method for the Solution of Partial Differential Equations, J. Comp. Phys., 165, 660-693 (2000)

[3] Nejadmalayeri, A., Vezolainen, A., Vasilyev, O.V. Reynolds Number Scaling of CVS and SCALES, Physics of Fluids, 25, 110823 (2013)

[4] Nejadmalayeri, A., Vezolainen, A., Brown-Dymkoski, E., Vasilyev, O.V. Parallel Adaptive Wavelet Collocation Method for PDEs, J. Comp. Phys., 298, 237-253 (2015)

[5] Crassin, C., Neyret, F., Lefebvre, S., Eisemann, E. GigaVoxels: Ray-Guided Streaming for Efficient and Detailed Voxel Rendering, ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D) (2009)

[6] Hadwiger, M., Beyer, J., Jeong, W.-K., Pfister, H. Interactive Volume Exploration of Petascale Microscopy Data Streams Using a Visualization-Driven Virtual Memory Approach, Proceedings of IEEE Visualization 2012

[7] Fogal, T., Schiewe, A., Krueger, J. An Analysis of Scalable GPU-Based Ray-Guided Volume Rendering, Large Data Analysis and Visualization (LDAV), IEEE Visualization (2013)
