IRW: An Incremental Representation
for Image-Based Walkthroughs

Overview

IRW is a new representation for interactive image-based walkthroughs. The target applications reconstruct a scene from novel viewpoints using samples from a spatial image dataset collected from a plane at eye-level. The datasets pair images with camera pose information and are often extremely large in size. Our representation exploits spatial coherence and rearranges the input samples as epipolar images. The base unit corresponds to a column of the original image that can be individually addressed and accessed. The overall representation, IRW, supports incremental updates, efficient encoding, scalable performance, and selective inclusion used by different reconstruction algorithms.



Research Team

This project is one portion of the research being conducted by the Spatial Encoding Research Group, or SERG. The following people contributed directly to this work:


Publications

David Gotz, Ketan Mayer-Patel, Dinesh Manocha. IRW: An Incremental Representation for Image-Based Walkthroughs. To appear in ACM Multimedia 2002, Juan-les-Pins, France (2002).



Overview of IRW

This section provides a general overview of IRW in three parts. We first describe epipolar plane images, a core concept in the IRW design. Next, we present the design goals for IRW. Finally, we present results from some experiments we have conducted. For a detailed description of the representation itself, read our paper.

Epipolar Geometry and Epipolar Plane Images

    Epipolar geometry describes the relationship between a pair of perspective cameras. Figure 1 shows the relationship between a camera at point A1 and a camera at point A2. The rectangular area around each camera is its projection plane. Each camera observes point G, which projects to g1 and g2 for cameras A1 and A2, respectively. The line through A1 and A2, called the baseline, intersects the cameras' projection planes at e1 and e2, which are called epipoles. The plane defined by the triangle A1A2G is the epipolar plane, shown in orange in Figure 1. The intersections of the epipolar plane with the cameras' projection planes are known as epipolar lines.


Figure 1: Epipolar Geometry For Two Cameras

As described above, epipolar geometry applies to a pair of perspective cameras. The same geometric relationship exists between a pair of images taken from a single camera when that camera moves between exposures. When the camera moves linearly along its view direction, more observations can be made. Figure 2 shows three such image frames, captured from a camera undergoing linear motion in the view direction. Note that the three epipoles (e1, e2, and e3) are collinear. In addition, all three projections of G (g1, g2, and g3) lie within the epipolar plane, as do the three epipolar lines.


Figure 2: Epipolar Geometry for Linear Camera Motion
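The linear-motion observation above can be illustrated with a toy computation. The sketch below uses hypothetical cameras of our own choosing (the intrinsics and positions are not from the paper): the epipole in one camera's image is the projection of the other camera's center, and when the second camera lies directly along the first camera's view axis, that projection lands at the principal point.

```python
# A minimal sketch (hypothetical cameras, not the paper's data): the epipole
# in one camera's image is the projection of the other camera's center.

def project(K, p):
    """Project a 3D point p (in camera coordinates) with intrinsics K."""
    x = K[0][0] * p[0] + K[0][2] * p[2]
    y = K[1][1] * p[1] + K[1][2] * p[2]
    w = p[2]
    return (x / w, y / w)

# Shared intrinsics: focal length 500, principal point (320, 240).
K = [[500, 0, 320],
     [0, 500, 240],
     [0, 0, 1]]

# Camera A1 at the origin; camera A2 one unit ahead along the view (+z) axis.
A1 = (0.0, 0.0, 0.0)
A2 = (0.0, 0.0, 1.0)

# The epipole e1 is the image of A2 as seen from A1.
e1 = project(K, (A2[0] - A1[0], A2[1] - A1[1], A2[2] - A1[2]))
print(e1)  # motion along the view axis puts the epipole at the principal point
```

Because all three cameras in Figure 2 lie on one line, repeating this computation for each pair yields collinear epipoles.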

For a more concrete example, Figure 3 shows a sequence of images taken under linear camera motion. The camera is moving across a virtual scene in the view direction. For all frames in the sequence, the epipoles are collinear.


Figure 3: Video Sequence of Linear Camera Motion

Given a sequence of images with linear camera motion, a new image can be formed by taking the center column from every frame of the sequence and arranging the columns in capture order. This new image is known as an epipolar plane image, or EPI. Figure 4 shows the first frame from the sequence above with a red box around its center column. This column becomes the leftmost column of the EPI shown to its right; the remaining columns of the EPI come from the center columns of the later frames in the sequence.


Figure 4: Building an EPI from a Video Sequence
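The construction just described reduces to a simple column-gathering loop. The sketch below uses tiny synthetic frames rather than real video, but the operation is the same one used to build the EPI in Figure 4: column k of the EPI is the center column of frame k.

```python
# A toy sketch of EPI construction (synthetic frames, not the paper's data):
# take the center column of every frame and lay the columns side by side.

def build_epi(frames):
    """frames: list of H x W images (lists of rows). Returns an H x N EPI."""
    height = len(frames[0])
    center = len(frames[0][0]) // 2
    # Column k of the EPI is the center column of frame k.
    return [[frame[row][center] for frame in frames] for row in range(height)]

# Three tiny 2x4 "frames" whose center column holds the frame index.
frames = [[[0, 0, k, 0], [0, 0, k, 0]] for k in range(3)]
epi = build_epi(frames)
print(epi)  # [[0, 1, 2], [0, 1, 2]]
```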

The structure apparent in the EPI in Figure 4 is due to the perspective projection of the camera. Regions of the EPI appear as objects become visible and disappear as they become occluded. EPIs have been used extensively in the computer vision community for structure from motion computations, and they form a core foundation of the IRW representation.


The IRW Design Goals

    In this section, we present our design goals. The details of the design itself are best described in the paper and are not included on this web page.

Input Data

IRW is designed to store a collection of image samples captured as omnidirectional panoramas taken from a plane in space. Each panorama is captured from a plane at eye level and has an associated pose estimate for the camera position. Figure 6 shows two such panoramas: one synthetic and one real.


Figure 6: Panoramic Images

Design Goals

The IRW representation is designed to support a number of design goals. These goals include:

  • Efficient Encoding: compression rates comparable to or better than those of previous methods.
  • Incremental Updates: new samples should be easy to add without major recoding or reorganization of existing data.
  • Fine-grained Access: because applications generate reconstructions from small parts of many images, the representation must support fine-grained access to sub-regions of the images.
  • Fine-grained Selective Inclusion: the representation should be able to omit sub-regions of the images that provide little or no reconstructive capability.
To support these design goals, we developed IRW, an Incremental Representation for Image-Based Walkthroughs. A detailed description of IRW can be found in our paper.
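The fine-grained access goal can be made concrete with a small sketch. The interface below is our own illustrative naming, not the paper's API: the point is simply that reconstruction asks for individual columns by (image id, column index) rather than decoding whole images.

```python
# A minimal sketch (hypothetical naming, not the paper's API) of
# fine-grained, column-addressed access.

class ColumnStore:
    def __init__(self):
        self._columns = {}            # (image_id, col_index) -> column data

    def put(self, image_id, col_index, column):
        self._columns[(image_id, col_index)] = column

    def get(self, image_id, col_index):
        # Only the requested column is touched; no full-image decode.
        return self._columns[(image_id, col_index)]

store = ColumnStore()
store.put(image_id=0, col_index=128, column=[1, 2, 3])
print(store.get(0, 128))  # [1, 2, 3]
```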


Experimental Results

    The IRW representation exhibits a number of desirable properties. These include sublinear growth, adjustable compression rates and quality levels, selective inclusion, and a compression ratio that improves as the data set grows. These properties, in addition to the incremental nature of the representation, make IRW an effective representation for image-based walkthroughs.

One key property of IRW is its sublinear growth rate: as more columns are added, the size of the database grows more and more slowly. In Figure 8, the growth rate is shown by the solid curve. The dotted line highlights the linear rate associated with the first few columns. The size of the database clearly trends below the dotted line as more columns are added.


Figure 8: Sublinear Growth Rate

The sublinear growth rate is explained by what happens to new columns as they are added to the database. When the database is empty, the first column added is by necessity an index column. As Figure 9 shows, the probability that a new column is encoded as an index column decreases as more columns are added. Because index columns are typically far larger on disk than difference columns, this behavior implies sublinear growth.


Figure 9: Distribution of Index Columns
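The index-versus-difference decision can be sketched as follows. This is a hypothetical stand-in for the actual encoder (the names, matching rule, and threshold are ours, not the paper's): a new column is stored as a cheap residual against a close-matching index column when one exists, and becomes a new, much larger index column otherwise.

```python
# Hypothetical sketch of the index/difference decision (names and matching
# rule are illustrative, not the paper's encoder).

def add_column(database, column, threshold):
    """database: list of ('index', data) or ('diff', ref, residual) entries."""
    best_ref, best_err = None, float('inf')
    for i, entry in enumerate(database):
        if entry[0] != 'index':
            continue
        err = sum((a - b) ** 2 for a, b in zip(entry[1], column))
        if err < best_err:
            best_ref, best_err = i, err
    if best_ref is not None and best_err <= threshold:
        # Close match found: store only the (small) residual.
        residual = [a - b for a, b in zip(column, database[best_ref][1])]
        database.append(('diff', best_ref, residual))
    else:
        # No usable match: store a full (large) index column.
        database.append(('index', column))

db = []
add_column(db, [10, 10, 10], threshold=4)   # empty database -> index column
add_column(db, [11, 10, 10], threshold=4)   # close match -> difference column
print([entry[0] for entry in db])  # ['index', 'diff']
```

As the database fills with index columns, new columns are increasingly likely to find a close match, which is the mechanism behind the decreasing probability in Figure 9.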

Like JPEG and other compression techniques, IRW provides methods for adjusting the compression rate and quality level. Quality can be adjusted in two ways. First, the coefficients returned by the wavelet transform are quantized to a fixed number of bits; increasing the number of bits results in higher quality but lower compression rates. Second, quality can be adjusted by clamping all small coefficients to zero. We used this second method in one experiment, and the results are shown in Figure 10. As the figure shows, increasing the threshold below which coefficients are clamped to zero reduces the image quality (as measured by peak signal-to-noise ratio).


Figure 10: PSNR vs. Zero-Cutoff Threshold
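The two quality knobs can be sketched together. The parameters and coefficient values below are purely illustrative (the paper's wavelet transform and quantizer details are not reproduced here); the sketch only shows the trade-off: quantizing to more bits and clamping fewer coefficients yields a higher PSNR.

```python
import math

# A toy sketch of the two quality knobs (illustrative parameters, not the
# paper's codec): quantize coefficients to a fixed number of bits, and clamp
# coefficients below a threshold to zero.

def quantize(coeffs, bits, threshold):
    step = 1.0 / (1 << bits)          # fewer bits -> coarser quantization
    out = []
    for c in coeffs:
        if abs(c) < threshold:        # zero-clamping: drop weak coefficients
            out.append(0.0)
        else:
            out.append(round(c / step) * step)
    return out

def psnr(original, approx, peak=1.0):
    mse = sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)
    return float('inf') if mse == 0 else 10 * math.log10(peak ** 2 / mse)

coeffs = [0.9, 0.02, -0.5, 0.003]
coarse = quantize(coeffs, bits=4, threshold=0.01)
fine = quantize(coeffs, bits=8, threshold=0.001)
print(psnr(coeffs, coarse) < psnr(coeffs, fine))  # more bits -> higher PSNR
```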

Figure 11 shows the results of an experiment designed to evaluate selective inclusion. The probability that a column is added to the database decreases as the database grows. This is the behavior we both expected and desired. When the database is small, new columns are very useful for increasing reconstruction quality. As the database grows, however, additional columns become less useful: their marginal utility decreases. At some point, columns supply so little additional information that they are no longer worth storing at all. The exact behavior depends both on the reconstruction algorithm (without which we cannot decide what is 'useful') and on an error tolerance. Figure 11 shows three different error tolerances, all using the same reconstruction algorithm.


Figure 11: Probability That Column is Added vs. Number
of Columns. (a) Low tolerance for reconstruction error
(b) Moderate tolerance for reconstruction error
(c) High tolerance for reconstruction error
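The inclusion test above can be sketched as a simple gate. The nearest-column predictor below is a stand-in for a real reconstruction algorithm (the paper's algorithm is not reproduced here); the sketch only shows the rule: a column is stored only if the database cannot already reconstruct it within the error tolerance.

```python
# Hypothetical sketch of selective inclusion (the nearest-column predictor is
# a stand-in for a real reconstruction algorithm).

def maybe_add(database, column, tolerance):
    """Returns True if the column was added (i.e., it was still useful)."""
    if database:
        # Stand-in reconstruction: predict from the nearest stored column.
        best = min(database,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(c, column)))
        error = sum((a - b) ** 2 for a, b in zip(best, column))
        if error <= tolerance:
            return False              # marginal utility too low: skip it
    database.append(column)
    return True

db = []
print(maybe_add(db, [5, 5, 5], tolerance=2))   # empty database -> True
print(maybe_add(db, [5, 5, 6], tolerance=2))   # reconstructable -> False
print(maybe_add(db, [9, 0, 1], tolerance=2))   # novel content -> True
```

Raising the tolerance makes the gate reject more columns sooner, which is the difference between the three curves in Figure 11.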

The benefit of selective inclusion is improved compression: marginally useful samples are no longer stored in the database. Figure 12 shows this behavior. The three curves correspond to the same three error tolerances as in Figure 11.


Figure 12: Compression Ratio As Number of Columns Increases.
(a) Low tolerance for reconstruction error
(b) Moderate tolerance for reconstruction error
(c) High tolerance for reconstruction error

Finally, Figure 13 shows a table of compressed file sizes to compare the behavior of IRW with that of traditional JPEG compression. As the statistics show, IRW provides comparable compression rates to JPEG while still supporting added benefits such as incremental updates, sublinear growth, and fine-grained data access.

Real Scene Using JPEG
  Quality     80     84     88     92     96     100
  Size (KB)   11.5   13.9   17.2   21.6   34.0   63.7
  PSNR (dB)   44.99  45.52  46.23  47.05  48.40  50.34

Real Scene Using IRW
  Quality     7      6      5      4      3      2
  Size (KB)   17.8   21.4   26.6   33.9   45.6   61.4
  PSNR (dB)   37.29  38.05  39.05  40.10  41.18  43.15

Synthetic Scene Using JPEG
  Quality     80     84     88     92     96     100
  Size (KB)   12.9   14.8   17.5   21.6   31.8   56.0
  PSNR (dB)   36.29  35.83  36.21  36.64  36.58  36.61

Synthetic Scene Using IRW
  Quality     7      6      5      4      3      2
  Size (KB)   45.5   50.0   52.2   54.2   57.6   84.2
  PSNR (dB)   38.59  39.39  40.26  41.40  43.17  45.01

Figure 13: Compression Rates: IRW vs. JPEG




Internal Project Resources

The IRW project maintains a library of information for use by project members. Access to this portion of the web page requires either being on the UNC-CH Computer Science network or having a valid login and password.
Send Questions/Comments to David Gotz, gotz AT cs.unc.edu