Hierarchical Z-Buffer Occlusion Culling - Generating Occlusion Volumes

Almost a year ago I wrote a couple of posts on Hierarchical Z-Buffer Occlusion Culling (HZB-OC).

One very glaring issue I left largely untouched was the workflow issue.  Artists currently have to author all or a large percentage of the volumes by hand in the implementations I’m aware exist.  Ideally these simplified occlusion volumes would be generated by a tool in the art pipeline instead.

A few weeks ago I started looking for a new side project and decided to try and flesh-out a very rough idea I had for automatically generating occlusion volumes.  This is still a work in progress but I wanted to share my current progress because I think it has some value even at this early stage.

The Problem

An ideal occlusion volume has some important features,

  1. Conservativeness – Doesn’t extend beyond the surface of the mesh
  2. Simplicity – The occlusion volume is made of very few triangles or is fast to render
  3. Volume Conservation – Closely matches the original mesh’s volume
  4. Dynamic – Some games have large movable occluders or destroyable walls

Normal methods of simplifying a mesh such as typical triangle simplification causes problems in both the area of conservativeness and volume conservation.  In some cases you can use the physics collision mesh if available.  One thing to be aware of is that when the physics volume is larger than the visual mesh it can cause false occlusions leading to things popping into view.  Also not every mesh needs a physics volume nor are physics volumes always meshes.  Will Vale’s talk from GDC 2011 covered using physics collision meshes pretty thoroughly for HZB-OC, you should check it out.

The Technique

Let me start by summarizing the technique I’ve been developing for generating the occlusion volumes before I go in-depth into each step in the process.

  1. Find all the voxels completely inside a mesh
  2. Find the voxel at the densest point in the volume
  3. Expand a box from this point until all sides collide with the mesh surface or another box
  4. Repeat 2-3 until you’ve consumed X% of the total volume
  5. Filter small or useless boxes (Unimplemented)
  6. Use a Constructive Solid Geometry (CSG) algorithm to merge the boxes you create

1. Voxelization

First you have to determine all the voxels completely inside the mesh.  That way we can have complete confidence that anything we generate contained inside these voxels will be conservative.  It also gives us a very easy way of quantifying the total volume and the volume remaining.

The voxelization algorithms I ended up using comes from this paper Fast Parallel Surface and Solid Voxelization on GPUs [Schwarz 2010].  I used the 3.1 section for surface voxelization and the 4.1 section for solid voxelization.  Both are required because a voxel is considered part of the solid volume if the center of the voxel is inside the mesh.  We want the entire voxel to be inside the mesh though.  So you have to remove the excess voxels from the solid set using the surface set.

My first implementation was in C++ and entirely on the CPU which took about 20 seconds to run on a mesh with ~10,000 triangles and a 503 voxel volume.  In the version I’m presenting here I moved to C#, when I did that time went up to about 60 seconds.  So I ended up porting a similar version of the CPU implementation to the GPU using OpenCL (Cloo), on the same data it ran in about 6 seconds on my Nvidia 540M, with no attempt to optimize.

One unfortunate limitation of Schwarz’s solid voxelization algorithm it that it requires watertight meshes.  The reason for this is you’re shooting a ray down a column of voxels until you intersect with a triangle.  You’re then xor’ing the bit for each voxel signaling that this column of voxel’s is inside or outside.  Voxels that are inside a mesh will be Xor’ed an odd number of times, where as outside voxels will be xored an even number of times.  So if there is a hole in the mesh, you’d leave a incomplete trail of xor’s in that column, leaving every voxel from the last triangle to the bounding box marked as ‘inside’.

Two papers I’ve run across that might contain solutions to this problem are An Automatic Hole-Filling Algorithm for Polygon Meshes and Complete Polygonal Scene Voxelization.

2. Find The Highest Density Voxel

In this step you’ll find the voxel that is furthest away from any external area, and excluding any voxel that has already been enclosed by a box in step 3.  Because the number of empty voxels likely far exceeds the number of voxels touching an empty voxel, you’ll probably want to test against that list instead of determining the distance from the closest empty voxel.

For my project this was implemented on the CPU, but this is probably something you could do faster on the GPU if this ever became too slow for your needs.

3. Box Expansion

Once you’ve found the densest voxel you’re going to create a 13 voxel sized box at that location.  You’ll then proceed to iteratively expand each side of the box in voxel space until you can’t expand any side of the box without entering an empty voxel or another box that has already been placed.

As you verify each expansion of the box you’ll mark the enclosed voxels in such a way so that the next time you choose the densest voxel you can exclude any already enclosed in a box.

Currently the expansion order is uniform, however, I suspect a better approach would be some small amount of prediction allowing me to grow one side more than another to maximize the potential volume of each box. 

The best implementation of this AABB expansion methodology would be to use an optimization solver, but that could potentially take quite awhile to run.

I’m still looking for alternatives to the AABB box expansion method of generating the simplified geometry. The axis aligned boxes aren’t the best for models at non-right angles.  Perhaps a better approach could be divined with OBBs but that’s just conjecture.

I briefly considered the Iso-Surface method Xi Wang mentioned in his Automated Level of Detail Generation for HALO: REACH talk at GDC 2011 but that generates a lot of triangles and a volume larger than the voxel volume you start with. So that just gets us right back to the problem we started with.

4. Repeat 2-3

I ended up having two different stopping conditions.  If an absolute percentage of the remaining volume is met I stop adding new boxes.  Alternatively if the last box created consumed too small a percentage of the total volume that also stops the box expansion process.

Some other things I didn’t implement but that could be tried are, setting a maximum number of boxes, or a box too small (in voxel space, instead of percentage) as a cutoff point.

5. Filter Boxes

This portion of the technique I haven’t tried but makes a lot of sense depending on your stopping condition.  If several small boxes are generated before one of the cutoff conditions is met during step 4, you’ll probably want to ignore them before moving to step 6 to reduce your occluder triangle count.

6. Merge Boxes With Constructive Solid Geometry

Drawing individual boxes even as a single draw call would cause lots of overdraw.  Instead we should take our collection of boxes and combine them into a single mesh before we finish.  To do this we’re going to use Constructive Solid Geometry (CSG).

Soon after I realized I was going to need CSG (or something akin to it) to solve this problem I was informed of a new book, Game Development Tools.  In that book there is an implementation for Real-Time Constructive Solid Geometry written by Sander van Rossen and Matthew Baranowski.  The implementation is pretty slick and I was able to get it up and running pretty quickly.

The Results: Time-Lapse

The box expansion runs in a fraction of a second, but I slowed it down and broke it up over several frames so that you could see how the occlusion shape evolves.

The Results: Different Models

To get an idea of the results on different models I’ve provided some screenshots.  The settings I used for these was to cutoff after 90% of the volume had been consumed by boxes.

Notes

In situations where the technique doesn’t work very well allow an artist to specify a custom mesh.  In some cases an option to use the same mesh for both the visual and the occluder would probably be a good idea (like for planes).

To reduce draw calls you will likely want to generate the occluders during export time.  Then during the cook stage in your level editor, merge some number of the occluders together to reduce draw calls.

Download

Download Voxelization Occlusion Generator License: MIT

Thanks

Thanks to Michael Noland and Shaun Kime for letting me bounce ideas off them.  Also thanks to Stephen Hill for telling me about Rossen / Baranowski’s Constructive Solid Geometry implementation and for giving me feedback on this post.