Key Ideas in Image Analysis and Vision

This page summarizes the key ideas about image analysis and vision that form the basis of our research at UNC. It is meant as an introduction to our research philosophy and approaches.

We seek to understand vision as a unified, comprehensible process with solid mathematical and computational foundations.
We reject the notion that vision is a collection of ad hoc hacks ("God is a hacker") and that therefore no sensible, unified understanding of vision is possible.
We solve image analysis problems from various disciplines both to contribute to the scientific enterprise of our application clients and to sharpen our methods and techniques.
By working with researchers in other disciplines, our research progress can be judged by critical outsiders, not just by how we feel about our results. This keeps us focused on substantive issues and not on esoterica of interest only to ourselves. For example, if 3 mm accuracy in locating an object in an image is sufficient for an application, the push for 0.5 mm accuracy, which will be an expensive, difficult enterprise, must be justified by other criteria or applications.
Methods that do not generate new insights about the nature of vision must justify themselves by their immediate utility for applications.
We do not try to generate new methods to solve each particular image analysis problem, but rather to customize a few basic, powerful models and methods to fit the application at hand. That way, the solution to each new problem strengthens our insights about our broader models of vision.

The power of our image analysis research for applications comes from our goal of customizing an abstractly defined "visual system" to meet the needs of specific applications. An effective way to characterize our research programs is to say that we are developing languages for describing image analysis problems, in which the description of a problem contains the keys to its solution. Thus, the ability to describe an image analysis task effectively is central to our efforts.

There are a few general principles that underlie our approach to image analysis research:

Multiscale analysis is essential to vision, not only because objects can appear at different scales and a single object may exist across multiple scales, but also because the act of dispersing image information through scale is a powerful analysis method (spectroscopy) on its own merits.
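
As a rough illustration of dispersing information through scale, the sketch below blurs a one-dimensional signal with Gaussians of increasing width. The Gaussian kernel, the clamped border handling, and the names gaussian_kernel, blur, and scale_space are assumptions made for this example, not a description of any particular implementation of ours.

```python
import math

def gaussian_kernel(sigma, truncate=4.0):
    """Sampled 1D Gaussian, normalized to sum to 1 (truncated at truncate * sigma)."""
    radius = max(1, int(math.ceil(truncate * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def scale_space(signal, sigmas):
    """Return one blurred copy of the signal per scale, keyed by sigma."""
    return {sigma: blur(signal, sigma) for sigma in sigmas}

# A wide bar (coarse structure) plus a one-sample spike (fine structure).
signal = [0.0] * 64
for i in range(20, 44):
    signal[i] = 1.0
signal[8] = 1.0

stack = scale_space(signal, [1.0, 2.0, 4.0, 8.0])
```

In this toy signal the spike at index 8 is prominent at the finest scale and has largely faded by sigma = 4, while the center of the bar remains strong even at sigma = 8. Each structure shows up over its own range of scales.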

Geometric analysis is essential to vision since it is the geometric structure of an image (its spatial scale and derivative structure) that conveys the image content.
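
One common way to measure derivative structure at a chosen scale is to convolve with derivatives of a Gaussian. The sketch below does this for a one-dimensional signal and returns the value, slope, and curvature at a point. The function names jet_kernels and local_jet are hypothetical, and the sampled, truncated kernels only approximate the continuous derivatives.

```python
import math

def jet_kernels(sigma, truncate=4.0):
    """Sampled Gaussian and its first and second derivatives at scale sigma."""
    radius = max(1, int(math.ceil(truncate * sigma)))
    offsets = list(range(-radius, radius + 1))
    g = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(g)
    g = [w / total for w in g]
    g1 = [-(k / sigma ** 2) * w for k, w in zip(offsets, g)]
    g2 = [(k * k / sigma ** 4 - 1.0 / sigma ** 2) * w for k, w in zip(offsets, g)]
    return offsets, (g, g1, g2)

def local_jet(signal, i, sigma):
    """Approximate (value, first derivative, second derivative) at index i."""
    offsets, kernels = jet_kernels(sigma)
    n = len(signal)
    jet = []
    for kernel in kernels:
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i - k, 0), n - 1)  # convolution index, clamped at borders
            acc += w * signal[j]
        jet.append(acc)
    return tuple(jet)

# A ramp with slope 2: the measured slope should be close to 2 and the
# measured curvature close to 0 away from the borders.
ramp = [2.0 * i for i in range(64)]
value, slope, curvature = local_jet(ramp, 32, sigma=2.0)
```

The three numbers are the leading coefficients of a Taylor expansion of the blurred signal about index 32, so they describe the local geometry at that scale without committing to any symbolic interpretation.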

Statistical methods are essential to vision to account for the variability in the appearance of objects, which derives from the variability of semantically similar objects in the real world.

Much of the geometric analysis of an image needs to precede the extraction of symbolic or iconic descriptions of the image. Thus, we compute extensive geometric measurements having continuous values (Taylor series, "boundariness", "medialness", texture) before we iconify any representation of content (boundary, medial, or landmark location, texture class, etc.).
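
The ordering described above can be sketched in one dimension: first compute a continuous-valued measurement everywhere, and only afterwards reduce it to discrete locations. Here "boundariness" is taken to be the magnitude of the Gaussian first derivative, which is an assumption made for illustration. The threshold value and the names boundariness and iconify_boundaries are likewise hypothetical.

```python
import math

def boundariness(signal, sigma, truncate=4.0):
    """Continuous measurement: magnitude of the Gaussian first derivative."""
    radius = max(1, int(math.ceil(truncate * sigma)))
    offsets = list(range(-radius, radius + 1))
    g = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(g)
    dg = [-(k / sigma ** 2) * w / total for k, w in zip(offsets, g)]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, dg):
            j = min(max(i - k, 0), n - 1)  # convolution index, clamped at borders
            acc += w * signal[j]
        out.append(abs(acc))
    return out

def iconify_boundaries(measure, threshold):
    """Iconic step: keep local maxima of the measurement above a threshold."""
    return [i for i in range(1, len(measure) - 1)
            if measure[i] >= threshold
            and measure[i] > measure[i - 1]
            and measure[i] >= measure[i + 1]]

# A bar occupying indices 20..43; its boundaries lie between 19|20 and 43|44.
signal = [1.0 if 20 <= i < 44 else 0.0 for i in range(64)]
measure = boundariness(signal, sigma=2.0)               # continuous values everywhere
boundaries = iconify_boundaries(measure, threshold=0.05)  # discrete locations
```

The measure list keeps a graded response at every sample, so later stages can still weigh weak or ambiguous evidence. Only the final call discards that information in favor of a short list of locations.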

Edges are not an adequate foundation for vision. Associating a mathematical edge with an object boundary is completely unjustified: edges can exist at locations other than boundaries and some visible boundaries between objects lack edges.

An effective, compact, and rich description of object shape consists of a small number of medial loci with associated boundary loci.