UNC-Chapel Hill COMP 790-58 GPGPU



3D Occlusion Inference from Multi-camera Videos

- Acceleration and Comparison

Li Guan


1. Previous Work

We consider the problem of detecting and accounting for occluders in a 3D scene, based on silhouette cues in video streams obtained from multiple calibrated views. We have shown that static occluders in the interaction space of dynamic objects can be detected, and their 3D shape fully recovered, purely from the motion of the dynamic objects. Occluder information can then be used for online correction and consolidation of dynamic object shape.

We use a volumetric representation. So far, the CPU version takes about 1 minute to process one set of frames from 9 cameras for a 128 by 128 by 128 volume. Normally, thousands of video frames per camera have to be processed to achieve the result shown in the right column of the above figure. The ultimate goal is to make this process real-time, so acceleration is a must. The advantage of graphics hardware is that volumetric computation is easy to parallelize, so we hope to get a large speedup. The challenges lie in the following areas:
(1) Peak finding along each viewing ray for every voxel, which is computationally intensive and hard to parallelize.
(2) Probability accumulation across camera views and time frames, which requires extra care about the numerical precision available.

2. Goal

Analyze the CPU version and identify the parts that can benefit from GPU acceleration. GLSL on an ATI X1400 is used for evaluation.

3. CPU Version Analysis

From the above chart we know:
(1) the CPU version is bounded by O(fcN), where f is the number of frames, c is the number of cameras, and N is the number of voxels in the grid.
(2) the most time-consuming process is the peak-finding in the Occluder grid computation step, which takes O(3cN) time for every time step. Since the CPU version already relies on acceleration techniques such as divide-and-conquer to reach this speed, peak-finding needs particularly careful attention in the GPU version.
(3) most of the computations are per voxel, which makes GPU parallelization feasible. However, the computation of the O grid depends on the completion of the F grid computation, so the program has to stall until the F computation finishes before running the O computation. The same holds for the peak-finding.
(4) it is unlikely that all temporary volumes can be stored in memory, which means we may need to redesign the data flow for the GPU implementation.
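
To make points (1) and (4) concrete, the following back-of-the-envelope sketch uses the figures quoted earlier (9 cameras, a 128 by 128 by 128 grid, about 1 minute per set of frames). The frame count of 1000 and the 4 bytes per voxel are assumptions chosen for illustration, and the function names are hypothetical.

```python
def voxel_camera_evaluations(frames, cameras, voxels):
    """Work implied by the O(fcN) bound: one evaluation per frame, camera and voxel."""
    return frames * cameras * voxels


def volume_bytes(side, bytes_per_voxel):
    """Storage for one cubic volume with `side` voxels along each axis."""
    return side ** 3 * bytes_per_voxel


SIDE = 128                   # grid resolution from the text
CAMERAS = 9                  # camera count from the text
FRAMES = 1000                # assumed; the text only says "thousands"
BYTES_PER_VOXEL = 4          # assumed: one 32-bit float per voxel
MINUTES_PER_FRAME_SET = 1.0  # approximate CPU timing from the text

n_voxels = SIDE ** 3
total_evaluations = voxel_camera_evaluations(FRAMES, CAMERAS, n_voxels)
mib_per_volume = volume_bytes(SIDE, BYTES_PER_VOXEL) / 2 ** 20
cpu_hours = FRAMES * MINUTES_PER_FRAME_SET / 60

print(n_voxels)           # 2097152
print(total_evaluations)  # 18874368000
print(mib_per_volume)     # 8.0
print(round(cpu_hours, 1))  # 16.7
```

Under these assumptions a single-channel float volume occupies 8 MiB, so every additional temporary grid adds the same amount again, and a 1000-frame sequence already costs roughly 16.7 hours on the CPU.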

4. Data Flow

Foreground Volume Inference:

Foreground Peak Finding:
To be updated...

5. Current Status

Feb 6th Proposal and webpage set up.

End of March, analysis of the CPU version, identifying which stages are good candidates for GPU acceleration and which are not. Main GPU framework set up.

Middle of April, first version.

End of April, testing, evaluation results, and final report.