The 3D image warping algorithm proposed by McMillan and Bishop takes regular single-layered depth images (called reference images) as its input. A major problem of 3D image warping is disocclusion artifacts, which are caused by areas that are occluded in the original reference image but visible in the current view. These artifacts appear as tears or gaps in the output image. The following figure shows an example of such disocclusion artifacts (in blue) when the viewer moves away from the original position where the reference images were taken:
The fundamental cause of disocclusion artifacts is that information about the previously occluded area is missing from the reference image. Using multiple reference images taken from different viewpoints reduces the artifacts, because an area that is not visible from one view may be visible from another. When multiple source images are available, we expect the disocclusion artifacts that occur while warping one reference image to be filled in by one of the other reference images. However, combining multiple reference images and eliminating the redundant information is a non-trivial problem.
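The tears described above can be reproduced in a one-dimensional toy setting. The following is a minimal sketch, not the McMillan and Bishop formulation: it assumes a purely horizontal viewer shift, so each pixel moves by a disparity of `baseline * focal / depth`. The function name `warp_scanline` and all of its parameters are hypothetical and chosen only for illustration.

```python
def warp_scanline(colors, depths, baseline, focal, width):
    """Forward-warp one scanline of a depth image to a horizontally shifted view.

    Assumption: pure sideways camera motion, so a pixel at depth z shifts
    left by baseline * focal / z. Output pixels that receive no sample
    stay None; those are the disocclusion gaps.
    """
    out = [None] * width
    zbuf = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        x_new = round(x - baseline * focal / z)
        # Keep the nearest sample when several land on the same output pixel.
        if 0 <= x_new < width and z < zbuf[x_new]:
            out[x_new] = color
            zbuf[x_new] = z
    return out


# Background (B) at depth 4 with a foreground object (F) at depth 1.
colors = ["B", "B", "B", "B", "F", "F", "B", "B"]
depths = [4, 4, 4, 4, 1, 1, 4, 4]
warped = warp_scanline(colors, depths, baseline=1, focal=4, width=8)
```

In this example the foreground shifts by four pixels and the background by one, so the two background positions that were hidden behind the foreground receive no sample. A second reference scanline taken from a viewpoint where that background is visible could supply the missing pixels.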
Recently, the Layered Depth Image (LDI) was proposed by Shade et al. to merge many reference images under a single center of projection. It tackles the occlusion problem by keeping multiple depth pixels per pixel location, while still maintaining the simplicity of warping a single reference image. Its limitation is that the fixed resolution of the LDI may not provide an adequate sampling rate for every reference image.
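The idea of keeping multiple depth pixels per pixel location can be sketched as a grid of depth-sorted lists. This is an illustrative data structure only, not the layout used by Shade et al.; the class names `DepthPixel` and `LayeredDepthImage`, the `eps` tolerance, and the rule for discarding redundant samples are all assumptions.

```python
from bisect import insort
from dataclasses import dataclass, field


@dataclass(order=True)
class DepthPixel:
    depth: float
    color: tuple = field(compare=False)  # ordering uses depth only


class LayeredDepthImage:
    """Each (x, y) location holds a list of depth pixels, nearest first."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.pixels = [[[] for _ in range(width)] for _ in range(height)]

    def insert(self, x, y, depth, color, eps=1e-3):
        """Add a sample; return False if it duplicates an existing layer.

        Assumption: two samples closer than eps in depth are treated as
        the same surface and the new one is dropped as redundant.
        """
        layers = self.pixels[y][x]
        if any(abs(p.depth - depth) < eps for p in layers):
            return False
        insort(layers, DepthPixel(depth, color))
        return True

    def layers_at(self, x, y):
        return self.pixels[y][x]


ldi = LayeredDepthImage(4, 4)
ldi.insert(1, 2, 5.0, (255, 0, 0))   # surface seen in one reference image
ldi.insert(1, 2, 2.0, (0, 255, 0))   # nearer surface from another image
```

Because every sample lives on the same fixed grid, a reference image with finer sampling than the grid loses detail when merged, which is the limitation noted above.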
The LDI Tree combines a hierarchical space partition scheme with the concept of the LDI. It preserves the sampling rate of the reference images by adaptively selecting an LDI in the LDI tree for each pixel. When rendering from the LDI tree, we only have to traverse the tree down to the levels whose sampling rate is comparable to that of the output image. Because each LDI also contains pre-filtered results from its child LDIs, progressive refinement is easy to implement. The pre-filtering also enables a new "gap filling" algorithm that fills the disocclusion artifacts that cannot be resolved by any reference image. The amount of memory required has the same order of growth as the 2D reference images. Furthermore, the rendering time is almost independent of the number of reference images (except for the extra detail that new reference images may provide). Therefore the LDI tree preserves an important advantage that image-based rendering has over traditional polygon-based rendering: the cost is bounded by the complexity of the reference images, not by the complexity of the scene.
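The traversal rule described above (descend only until a node's LDI is about as finely sampled as the output image) can be sketched as follows. This is a simplified illustration, not the criterion used in the paper: it assumes each node is a cubic cell carrying a square LDI, and it compares the approximate angle subtended by one LDI pixel with the angle of one output pixel. The names `LDITreeNode`, `select_nodes`, and `pixel_angle` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LDITreeNode:
    center: tuple          # cell center in world space
    size: float            # cell edge length
    ldi_resolution: int    # LDI pixels along one cell edge
    children: list = field(default_factory=list)

    def pixel_size(self):
        # World-space spacing between neighboring LDI samples in this cell.
        return self.size / self.ldi_resolution


def select_nodes(node, eye, pixel_angle, out):
    """Collect the coarsest nodes whose LDI is detailed enough for the view.

    Assumption: a node is detailed enough when one of its LDI pixels
    subtends no more than pixel_angle (radians, small-angle estimate)
    as seen from eye. Leaves are always accepted.
    """
    dist = max(math.dist(node.center, eye), 1e-9)
    footprint = node.pixel_size() / dist
    if footprint <= pixel_angle or not node.children:
        out.append(node)
        return
    for child in node.children:
        select_nodes(child, eye, pixel_angle, out)


# A two-level tree: one root cell with eight child cells of half the size.
root = LDITreeNode((0.0, 0.0, 0.0), 8.0, 4)
root.children = [
    LDITreeNode((dx, dy, dz), 4.0, 4)
    for dx in (-2.0, 2.0) for dy in (-2.0, 2.0) for dz in (-2.0, 2.0)
]

far_view, near_view = [], []
select_nodes(root, (0.0, 0.0, 100.0), 0.05, far_view)   # distant viewer
select_nodes(root, (0.0, 0.0, 10.0), 0.05, near_view)   # close viewer
```

With the distant viewer, the pre-filtered LDI at the root already matches the output sampling rate, so traversal stops there. With the close viewer, the traversal descends to the eight children. Stopping early at coarse, pre-filtered nodes is also what would make progressive refinement straightforward under this sketch.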
The next two figures show the results after combining 36 reference images taken from 9 different positions. The left image does not have the gap filling feature enabled, while the right image does.
The current implementation on SGI workstations with 250MHz MIPS R10000 processors takes about 6 seconds to produce a 320x320 output image when 100 reference images are used. The processing is done entirely on the host CPU. We plan to exploit the graphics pipeline and the multiprocessor features of the SGI Onyx2 to achieve near real-time rendering (i.e., multiple frames per second).
Download the paper (in PDF format, about 1.04MB). (In the proceedings of SIGGRAPH 99)
Maintained by Chun-Fa Chang
Last Modified 17 February 1999