Research Topics
My current research interests cover several aspects of 3D modeling from
images. For more information, see also the
publications page.
Structure from Motion
What Can Missing Correspondences Tell Us About 3D Structure and Motion?
Practically all existing approaches to structure and motion computation
use only positive image correspondences to verify the camera pose
hypotheses. Incorrect epipolar geometries are solely detected by
identifying outliers among the found correspondences. Ambiguous patterns in
the images are often incorrectly handled by these standard methods. In
this work we propose two approaches to overcome such problems. First, we
apply non-monotone reasoning on view triplets using a Bayesian
formulation. In contrast to two-view epipolar geometry, image triplets
allow the prediction of features in the third image. Absence of these
features (i.e. missing correspondences) enables additional inference about
the view triplet. Furthermore, we integrate this view triplet handling
into an incremental procedure for structure and motion computation. Thus,
our approach is able to refine the maintained 3D structure when additional
image data is provided.
C. Zach, A. Irschara, H. Bischof. CVPR 2008.
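As a rough illustration of how missing correspondences can count as
evidence, the following minimal sketch applies Bayes' rule to a single
view triplet. It is not the model from the paper: the binomial
likelihood, the function name triplet_posterior and all probability
values are assumptions chosen only for illustration.

```python
from math import comb


def triplet_posterior(n_predicted, n_found, prior_correct=0.5,
                      p_found_if_correct=0.7, p_found_if_wrong=0.1):
    """Posterior probability that a view triplet is geometrically correct.

    n_predicted: number of features predicted in the third image
    n_found:     number of those predictions confirmed by a real feature
    The remaining arguments are hypothetical illustration values.
    """
    def binomial(k, n, p):
        # Probability of k confirmations out of n independent predictions.
        return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

    like_correct = binomial(n_found, n_predicted, p_found_if_correct)
    like_wrong = binomial(n_found, n_predicted, p_found_if_wrong)
    evidence = (prior_correct * like_correct
                + (1.0 - prior_correct) * like_wrong)
    return prior_correct * like_correct / evidence


if __name__ == "__main__":
    # Many confirmed predictions support the triplet,
    # many missing ones count against it.
    print(triplet_posterior(20, 15))
    print(triplet_posterior(20, 1))
```

Under these assumed values, a triplet whose predicted features are
mostly absent in the third image receives a low posterior, even if the
pairwise correspondences alone looked consistent.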
Towards Wiki-based Dense City Modeling
This work reports on the advances and on the current status of a
terrestrial city modeling approach, which uses images contributed by
end-users as input. Hence, the Wiki principle well known from textual
knowledge databases is transferred to the goal of incrementally building a
virtual representation of the occupied habitat. In order to achieve this
objective, many state-of-the-art computer vision methods must be applied
and modified according to this task. We describe the utilized 3D vision
methods and show initial results obtained from the current image database
acquired by in-house participants.
A. Irschara, C. Zach, H. Bischof. VRML Workshop in conjunction with ICCV 2007.
Multi-View Stereo and Dense 3D Modeling
A Globally Optimal Algorithm for Robust TV-L^1 Range Image Integration
Robust integration of range images is an important task for building
high-quality 3D models. Since range images, and in particular range maps
from stereo vision, may have a substantial number of outliers, any
integration approach aiming at high-quality models needs an increased
level of robustness. Additionally, a certain level of regularization is
required to obtain smooth surfaces. Computational efficiency and global
convergence are further preferable properties. The contribution of this
paper is a unified framework to solve all these issues. Our method is
based on minimizing an energy functional consisting of a total variation
(TV) regularization force and an L^1 data fidelity term. We present a
novel and efficient numerical scheme, which combines the duality principle
for the TV term with a point-wise optimization step. We demonstrate the
superior performance of our algorithm on the well-known Middlebury
multi-view database and additionally on real-world multi-view images.
C. Zach, T. Pock, H. Bischof. ICCV 2007.
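The point-wise optimization step mentioned in the abstract above can be
sketched for a single voxel as follows. This is a minimal sketch under
assumptions, not the paper's implementation: it assumes the per-voxel
subproblem has the form (v - u)^2 / (2*theta) + lam * sum_i |v - f_i|,
where u is the current smoothed value and f_i are the range
observations. The names pointwise_l1_step, lam and theta are
hypothetical.

```python
def pointwise_l1_step(u, observations, lam, theta):
    """Minimize (v - u)^2 / (2*theta) + lam * sum_i |v - f_i| over scalar v.

    The objective is convex and piecewise quadratic, so its minimizer is
    either one of the kinks f_i or a stationary point of one quadratic
    piece. Evaluating the energy at all of these candidates and keeping
    the best one therefore yields the exact minimizer.
    """
    def energy(v):
        data = sum(abs(v - f) for f in observations)
        return (v - u) ** 2 / (2.0 * theta) + lam * data

    n = len(observations)
    candidates = list(observations)
    # Stationary points: v = u + theta*lam*(n - 2k) for k = 0..n.
    candidates += [u + theta * lam * (n - 2 * k) for k in range(n + 1)]
    return min(candidates, key=energy)


if __name__ == "__main__":
    # Three consistent observations and one gross outlier.
    print(pointwise_l1_step(1.0, [1.0, 1.1, 0.9, 10.0], lam=1.0, theta=0.1))
```

Because the data term uses absolute differences rather than squared
ones, a single gross outlier among the observations shifts the result
only by a bounded amount, which is the robustness property the abstract
refers to.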
Mumford-Shah Meets Stereo: Integration of Weak Depth Hypotheses
Recent results on stereo indicate that an accurate segmentation is crucial
for obtaining faithful depth maps. Variational methods have successfully
been applied to both image segmentation and computational stereo. In this
paper we propose a combination in a unified framework. In particular, we
use a Mumford-Shah-like functional to compute a piecewise smooth depth map
of a stereo pair. Our approach has two novel features: First, the
regularization term of the functional combines edge information obtained
from the color segmentation with flow-driven depth discontinuities
emerging during the optimization procedure. Second, we propose a robust
data term which adaptively selects the best matches obtained from
different weak stereo algorithms. We integrate these features in a
theoretically consistent framework. The final depth map is the minimizer
of the energy functional, which can be found using the associated functional
derivatives. The underlying numerical scheme allows an efficient
implementation on modern graphics hardware. We illustrate the performance
of our algorithm on the Middlebury database as well as on real imagery.
T. Pock, C. Zach, H. Bischof. CVPR 2007.
Efficient Computer Vision Methods on the GPU
A Duality Based Approach for Realtime TV-L^1 Optical Flow
Variational methods are among the most successful approaches to calculate
the optical flow between two image frames. A particularly appealing
formulation is based on total variation (TV) regularization and the robust
L^1 norm in the data fidelity term. This formulation can preserve
discontinuities in the flow field and offers an increased robustness
against illumination changes, occlusions and noise. In this work we
present a novel approach to solve the TV-L^1 formulation. Our method
results in a very efficient numerical scheme, which is based on a dual
formulation of the TV energy and employs an efficient point-wise
thresholding step. Additionally, our approach can be accelerated by modern
graphics processing units. We demonstrate the real-time performance (30
fps) of our approach for video inputs at a resolution of 320x240 pixels.
C. Zach, T. Pock, H. Bischof. DAGM 2007.
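The point-wise thresholding step named in the optical flow abstract
above can be sketched for a single pixel as follows. This is a hedged
sketch rather than the paper's code: it assumes the data term is
linearized around the current flow u, so that the residual at a new
flow v is rho_u + <grad, v - u>, and it minimizes
|v - u|^2 / (2*theta) + lam * |residual(v)|. The function and parameter
names are hypothetical.

```python
def tvl1_threshold_step(u, rho_u, grad, lam, theta):
    """Point-wise minimizer of |v - u|^2 / (2*theta) + lam * |rho(v)|.

    u:     current flow vector (ux, uy) at this pixel
    rho_u: linearized image residual evaluated at u
    grad:  spatial image gradient (gx, gy) of the warped second frame
    """
    gx, gy = grad
    g2 = gx * gx + gy * gy
    if g2 == 0.0:
        # Without gradient the residual does not depend on v.
        return u
    lt = lam * theta
    if rho_u < -lt * g2:
        step = lt             # residual stays negative: full step along grad
    elif rho_u > lt * g2:
        step = -lt            # residual stays positive: full step against grad
    else:
        step = -rho_u / g2    # small residual: move so that it becomes zero
    return (u[0] + step * gx, u[1] + step * gy)


if __name__ == "__main__":
    print(tvl1_threshold_step((0.0, 0.0), 0.5, (1.0, 0.0), lam=1.0, theta=0.1))
```

Each pixel is updated independently of its neighbours with a few
arithmetic operations and one three-way branch, which is why a step of
this kind maps well to graphics hardware.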