Overview
IRW is a new representation for interactive image-based walkthroughs. The
target applications reconstruct a scene from novel viewpoints using samples
from a spatial image dataset collected from a plane at eye level. The
datasets pair images with camera pose information and are often extremely
large. Our representation exploits spatial coherence and
rearranges the input samples as epipolar images. The base unit corresponds
to a column of the original image that can be individually addressed and
accessed. The overall representation, IRW, supports incremental updates,
efficient encoding, scalable performance, and selective inclusion driven by
different reconstruction algorithms.
The IRW web page is divided into the following sections. Click on a
section heading to jump to the corresponding section.
Research Team
This project is one portion of the research being conducted by the
Spatial Encoding Research Group, or SERG. The following people contributed
directly to this work:
- David Gotz
- Ketan Mayer-Patel
- Dinesh Manocha
Publications
David Gotz, Ketan Mayer-Patel, and Dinesh Manocha. IRW:
An Incremental Representation for Image-Based Walkthroughs. To
appear in ACM Multimedia 2002, Juan-les-Pins, France (2002).
[Acrobat, 1,721k]
Overview of IRW
This section provides a general overview of IRW.
We first describe Epipolar Plane Images, a core concept to the IRW design.
Next, we present the design goals for IRW. Finally, we present results
from some experiments we have conducted. For a detailed description of
the representation itself, read our paper. This overview is broken
down into three main sections: Epipolar Plane Images, Design Goals, and
Results.
Epipolar geometry describes the relationship between a pair of perspective
cameras. Figure 1 shows the relationship between a camera at point
A1 and a camera at point A2. The
rectangular area around each camera is the projection plane. Each camera
observes point G. For cameras
A1 and A2, point G
projects to g1 and g2 respectively.
The line connecting the two cameras, or the baseline, is defined by
A1 and A2. This line intersects the
cameras' projection planes at e1 and e2,
which are called epipoles. The plane defined by the triangle
A1A2G is the epipolar plane. In Figure
1, the triangle is shown in orange. The intersections of the epipolar plane
with the cameras' projection planes are known as epipolar lines.
Figure 1: Epipolar Geometry For Two Cameras
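To make the relationship concrete, the short numerical sketch below (our own
illustration; the intrinsics, camera centers, and scene point are made-up
values) projects a point G into two cameras and checks that the two
projections satisfy the epipolar constraint, i.e., that g2 lies on the
epipolar line induced by g1.

```python
import numpy as np

# Minimal numerical sketch of the two-camera geometry above. The intrinsics K,
# camera centers A1/A2, and scene point G are illustrative values, not data
# from the paper.

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])      # shared pinhole intrinsics
A1 = np.array([0.0, 0.0, 0.0])             # camera center A1 (world origin)
A2 = np.array([1.0, 0.0, 0.0])             # camera center A2; A1-A2 is the baseline
G = np.array([2.0, 1.0, 5.0])              # scene point G observed by both cameras

# Both cameras look down +Z, so each projection matrix is P = K [I | -C].
P1 = K @ np.hstack([np.eye(3), -A1.reshape(3, 1)])
P2 = K @ np.hstack([np.eye(3), -A2.reshape(3, 1)])

g1 = P1 @ np.append(G, 1.0); g1 /= g1[2]   # image of G in camera 1
g2 = P2 @ np.append(G, 1.0); g2 /= g2[2]   # image of G in camera 2

# Fundamental matrix for this pure translation: F = K^-T [t]_x K^-1,
# where t is the translation taking camera-1 coordinates to camera-2 coordinates.
t = A1 - A2
t_x = np.array([[    0, -t[2],  t[1]],
                [ t[2],     0, -t[0]],
                [-t[1],  t[0],     0]])
F = np.linalg.inv(K).T @ t_x @ np.linalg.inv(K)

# g2 must lie on the epipolar line l2 = F g1 induced by g1.
print("epipolar constraint g2 . (F g1) =", float(g2 @ F @ g1))  # ~0 up to rounding
```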
As described above, epipolar geometry applies to a pair of perspective
cameras. The same geometric relationship exists between a pair of images
taken from a single camera that is moved between exposures. When the camera
moves linearly along its view direction, additional observations
can be made. Figure 2 shows three such image frames, captured from a
camera undergoing linear motion in the view direction. Note that the three
epipoles (e1, e2, and
e3) are all collinear. In addition, the three projections
of G (g1, g2, and
g3) all lie within the epipolar plane, as do the three
epipolar lines.
Figure 2: Epipolar Geometry for Linear Camera Motion
For a more concrete example, Figure 3 shows a sequence of images taken under
linear camera motion. The camera is moving across a virtual scene in the view
direction. For all frames in the sequence, the epipoles are collinear.
Figure 3: Video Sequence of Linear Camera Motion
Given a sequence of images with linear camera motion, an interesting image can
be formed by taking the center column from every frame of the sequence and
arranging the columns side by side in order of camera position. This new image is
known as an epipolar plane image, or EPI. Figure 4 shows the first
frame from the sequence above and a red box around the center column. This
column is the leftmost column in the EPI shown to its right. The rest of the
EPI is made up from the center columns of the remaining frames of the
sequence.
Figure 4: Building an EPI from a Video Sequence
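The sketch below (our own illustration in Python, not the authors'
implementation) shows how such an EPI can be assembled: take the center
column of each frame and stack the columns, in capture order, into a new
image.

```python
import numpy as np

# A small sketch (not the authors' code) of the EPI construction in Figure 4:
# take the center column of every frame and stack the columns side by side,
# in capture order, so the first frame contributes the leftmost EPI column.

def build_epi(frames):
    """frames: list of H x W x 3 images ordered by camera position along the path."""
    center = frames[0].shape[1] // 2
    columns = [frame[:, center, :] for frame in frames]  # one H x 3 slice per frame
    return np.stack(columns, axis=1)                     # H x num_frames x 3 EPI

# Example with placeholder data: 120 frames of a 480 x 640 RGB sequence.
frames = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8) for _ in range(120)]
epi = build_epi(frames)
print(epi.shape)  # (480, 120, 3): one EPI column per input frame
```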
The structure apparent in the EPI in Figure 4 is due to the perspective
projection of the camera. Regions of the EPI appear as objects become
visible and disappear as they become occluded. EPIs have been used
extensively in the computer vision community for structure from motion
computations. EPIs form a core foundation of the IRW representation.
In this section, we present our design goals. The details of the design
itself are best described in the paper and are not included on this
web page.
Input Data
IRW is designed to store a collection of image samples captured as
omnidirectional panoramas taken from a plane in space. Each panorama is
taken from a plane at eye level and has an associated pose estimate for
the camera position. Figure 6 shows two such panoramas: one synthetic and
one real.
Figure 6: Panoramic Images
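For concreteness, the sketch below shows one plausible shape for such an
input record. The field names and the column accessor are our own
illustration, not the authors' actual data format.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical sketch of an input sample for IRW: a panorama plus its pose
# estimate on the eye-level plane. Field names and types are illustrative only.

@dataclass
class PanoramaSample:
    image: np.ndarray     # H x W x 3 omnidirectional panorama (e.g., cylindrical unwrap)
    position: np.ndarray  # (x, y) pose estimate on the capture plane
    heading: float        # estimated camera orientation, in radians

    def column(self, i: int) -> np.ndarray:
        """Return column i of the panorama; image columns are IRW's base access unit."""
        return self.image[:, i, :]
```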
Design Goals
The IRW representation is designed to meet a number of goals. These goals
include:
- Efficient Encoding: compression rates comparable to previous
methods or better.
- Incremental: new samples should be easily added without major
recoding or reorganization of existing data.
- Fine-grained Access: because applications generate
reconstructions from small parts of many images, the representation must
support fine-grained access to sub-regions of the images.
- Fine-grained Selective Inclusion: the representation should
ignore sub-regions of the images that provide little or no reconstructive
capability.
To support these design goals, we developed IRW, an Incremental
Representation for Image-Based Walkthroughs. A detailed description of
IRW can be found in our paper.
The IRW representation exhibits a number of desirable properties. These
include sublinear growth, adjustable compression rates and quality levels,
selective inclusion, and an increasing compression ratio as the data set
grows in size. These properties, in addition to the incremental nature of
the representation, make IRW an effective representation for image-based
walkthroughs.
One key property of IRW is that it has a sublinear growth rate. As more
columns are added, the size of the database grows more and more slowly. In
Figure 8, the growth rate is shown by the solid curve. The dotted line is
drawn to highlight the linear rate associated with the first few columns.
The size of the database clearly trends below the dotted line as more
columns are added.
Figure 8: Sublinear Growth Rate
The sublinear growth rate is explained by what happens to new columns as
they are added to the database. When the database is empty, the first
column added is by necessity an index column. As Figure 9 shows, the
probability that a
new column is encoded as an index column decreases as more columns are
added. Because index columns are typically far larger on disk than
difference columns, the implication of this behavior is sublinear growth.
Figure 9: Distribution of Index Columns
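The sketch below is a hypothetical illustration of this index-versus-difference
decision. The similarity test, its threshold, and the in-memory column list are
placeholders rather than the actual IRW coder described in the paper.

```python
import numpy as np

# Hypothetical sketch of the index-vs-difference decision discussed above.
# The mean-absolute-difference test and its threshold are illustrative
# stand-ins for the real IRW encoder's criteria.

INDEX_THRESHOLD = 12.0    # made-up tolerance on mean absolute pixel difference
index_columns = []        # self-contained (index) columns already in the database

def encode_column(column):
    """Code a new column against the most similar existing index column, or
    store it as a new index column when nothing in the database is close enough."""
    best_ref, best_err = None, np.inf
    for i, ref in enumerate(index_columns):
        err = np.mean(np.abs(column.astype(float) - ref.astype(float)))
        if err < best_err:
            best_ref, best_err = i, err
    if best_ref is not None and best_err < INDEX_THRESHOLD:
        residual = column.astype(float) - index_columns[best_ref].astype(float)
        return ("difference", best_ref, residual)   # small on disk
    index_columns.append(column)
    return ("index", column)                        # large on disk, rare later on
```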
Like JPEG and other compression techniques, IRW provides methods for
adjusting the compression rate and quality level. Quality can be easily
adjusted in two ways. First, the coefficients returned by the wavelet
transform are quantized to a set number of bits. Increasing the number of
bits will result in higher quality but lower compression rates. Quality
can also be adjusted by clamping all low-magnitude coefficients to zero. We
used this second method during an experiment and the results are shown in Figure 10.
As the figure shows, increasing the threshold at which we clamp to zero
reduces the image quality (as measured by peak signal-to-noise ratio).
Figure 10: PSNR vs. Zero-Cutoff Threshold
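The sketch below (our own illustration, with generic bit depth and threshold
values; the wavelet transform itself is omitted) shows the two quality
controls applied to an array of coefficients, along with a standard PSNR
helper.

```python
import numpy as np

# Sketch of the two quality controls described above, applied to a generic
# array of wavelet coefficients. Bit depth and threshold values are
# illustrative; the wavelet transform and image reconstruction are omitted.

def quantize(coeffs, bits=8):
    """Uniformly quantize coefficients to 2**bits levels over their range.
    More bits -> higher quality, lower compression."""
    lo, hi = float(coeffs.min()), float(coeffs.max())
    step = (hi - lo) / (2 ** bits - 1) or 1.0   # avoid a zero step for flat input
    return np.round((coeffs - lo) / step) * step + lo

def zero_clamp(coeffs, threshold=1.0):
    """Clamp low-magnitude coefficients to zero (the control varied in Figure 10).
    A larger threshold -> better compression, lower PSNR."""
    out = coeffs.copy()
    out[np.abs(out) < threshold] = 0.0
    return out

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio (dB) between an original and a reconstructed image."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```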
Figure 11 shows the results of an experiment designed to evaluate selective
inclusion. The probability that a column is added to the database
decreases as the database size grows. This is the behavior that we both
expected and desired. When the database is small, new columns are very
useful in increasing reconstruction quality. However, as the database
grows, additional columns are less useful: the marginal utility decreases.
At some point, columns supply so little additional information that they are
no longer worth storing in the database at all. The exact behavior depends
both on the reconstruction algorithm (which determines what counts as
'useful') and on an error tolerance. Figure 11 shows three different error
tolerances, all using the same reconstruction algorithm.
Figure 11: Probability That a Column Is Added vs. Number of Columns.
(a) Low tolerance for reconstruction error
(b) Moderate tolerance for reconstruction error
(c) High tolerance for reconstruction error
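The behavior in Figures 11 and 12 is consistent with an admission test of
roughly the following form. This is a hypothetical sketch: the reconstruction
function and error measure are placeholders, since the actual test depends on
the application's reconstruction algorithm and the chosen error tolerance.

```python
import numpy as np

# Hypothetical sketch of selective inclusion: a new column is stored only if
# the existing database cannot already reconstruct it within the tolerance.
# `reconstruct(database, pose)` stands in for the application's reconstruction
# algorithm; `tolerance` plays the role of the thresholds in Figures 11 and 12.

def should_include(column, pose, database, reconstruct, tolerance):
    """Return True when the column adds enough information to be worth storing."""
    if not database:
        return True                              # an empty database keeps everything
    predicted = reconstruct(database, pose)      # what the database already predicts here
    error = np.mean(np.abs(column.astype(float) - predicted.astype(float)))
    return error > tolerance                     # marginal columns fall below the tolerance
```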
The benefit of selective inclusion is improved compression: marginally
useful samples are no longer stored in the database. Figure 12 shows this
behavior. The three curves correspond to the same three error tolerances as in
Figure 11.
Figure 12: Compression Ratio As Number of Columns Increases.
(a) Low tolerance for reconstruction error
(b) Moderate tolerance for reconstruction error
(c) High tolerance for reconstruction error
Finally, Figure 13 shows a table of compressed file sizes to compare the
behavior of IRW with that of traditional JPEG compression. As the statistics
show, IRW provides comparable compression rates to JPEG while still supporting
added benefits such as incremental updates, sublinear growth, and fine-grained
data access.
Real Scene Using JPEG
| Quality   | 80    | 84    | 88    | 92    | 96    | 100   |
| Size (KB) | 11.5  | 13.9  | 17.2  | 21.6  | 34.0  | 63.7  |
| PSNR (dB) | 44.99 | 45.52 | 46.23 | 47.05 | 48.40 | 50.34 |

Real Scene Using IRW
| Quality   | 7     | 6     | 5     | 4     | 3     | 2     |
| Size (KB) | 17.8  | 21.4  | 26.6  | 33.9  | 45.6  | 61.4  |
| PSNR (dB) | 37.29 | 38.05 | 39.05 | 40.10 | 41.18 | 43.15 |

Synthetic Scene Using JPEG
| Quality   | 80    | 84    | 88    | 92    | 96    | 100   |
| Size (KB) | 12.9  | 14.8  | 17.5  | 21.6  | 31.8  | 56.0  |
| PSNR (dB) | 36.29 | 35.83 | 36.21 | 36.64 | 36.58 | 36.61 |

Synthetic Scene Using IRW
| Quality   | 7     | 6     | 5     | 4     | 3     | 2     |
| Size (KB) | 45.5  | 50.0  | 52.2  | 54.2  | 57.6  | 84.2  |
| PSNR (dB) | 38.59 | 39.39 | 40.26 | 41.40 | 43.17 | 45.01 |

Figure 13: Compression Rates: IRW vs. JPEG
Internal Project Resources
The IRW project maintains a library of information for use by project
members. To access this portion of the web page, you must either be on the
UNC-CH Computer Science network or have a valid login and password. To
view these resources,
click here.