

COMP254 - Image Processing and Analysis
Final Project


Landmark-Based
Statistical Shape Classification
using Thin-Plate Splines

RUI BASTOS

This project was developed in conjunction with Gregory Pruett.

Outline

  1. Shape Classification with Image Warping
  2. Mean Shape and Affine Transformations
  3. Results
  4. Applications
  5. Limitations
  6. Conclusions
  7. Future Work
  8. References

Shape Classification
Based on Bending Energy

We are interested in classifying objects according to their 3D shape using 2D projections (images). Given a set of images of different objects used to train the classifier, we would like to process any new image and assign it to the object it represents. In this project we investigated the application of image warping with thin-plate splines to shape classification.

IMAGE WARPING

Given a pair of images, image warping is a one-to-one mapping that describes how the pixels of one image should be displaced to reproduce the spatial structure of the other image. Image warping operates on pixel locations (it does not change pixel colors) and can compress or stretch regions of the warped image, which may require resampling. We use thin-plate splines as the continuous representation between samples in order to perform the resampling.
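
As a minimal sketch of how a thin-plate spline can carry one set of points onto another, the Python code below fits a spline to corresponding landmarks and applies it to arbitrary pixel locations. It assumes the usual 2D thin-plate kernel U(r) = r^2 log r^2; the function names (tps_kernel, fit_tps, warp_points) and the example landmarks are hypothetical and are not taken from our system.

```python
import numpy as np

def tps_kernel(a, b):
    """U(r) = r^2 log r^2 between every point of a (m x 2) and of b (n x 2)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    safe = np.where(d2 > 0.0, d2, 1.0)  # log(1) = 0, so U(0) evaluates to 0
    return d2 * np.log(safe)

def fit_tps(src, dst):
    """Fit the spline that maps each src landmark exactly onto its dst landmark."""
    n = len(src)
    K = tps_kernel(src, src)
    P = np.hstack([np.ones((n, 1)), src])   # rows are (1, x, y)
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    sol = np.linalg.solve(L, rhs)
    return sol[:n], sol[n:]                 # non-affine weights W, affine part A

def warp_points(query, src, W, A):
    """Evaluate the fitted spline at arbitrary locations (for example pixel centers)."""
    U = tps_kernel(query, src)
    Q = np.hstack([np.ones((len(query), 1)), query])
    return Q @ A + U @ W

# Hypothetical landmarks: a unit square whose center landmark is pushed upward.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
dst = src.copy()
dst[4] = [0.5, 0.8]

W, A = fit_tps(src, dst)
warped = warp_points(src, src, W, A)        # the landmarks themselves land on dst
```

Evaluating warp_points on a regular grid of pixel centers gives the displaced location of every pixel, which is where the resampling mentioned above would take place. The landmarks must be distinct and not all collinear, otherwise the linear system is singular.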

LANDMARKS

In order to capture the shape of an object, we represent it as a set of landmarks (important points on the structure of the object) that bend an imaginary flat thin plate placed above its 2D projection (image). Landmarks should be placed at characteristic points of the object that can be used to distinguish two objects of the same class (e.g., for two faces: eyes, nose, ears, etc.). The landmarks should approximate regions where we see visual discontinuities in the object. Given two images to be warped, we have to specify landmarks in both the source and the destination image. Both images must have the same number of landmarks, specified in the same order of correspondence. A landmark representing a left eye in the source image must have a corresponding landmark representing a left eye in the destination image.

BENDING ENERGY CLASSIFICATION

Given a set of landmarks for each standard object, we can compute the bending energy required to warp a new set of landmarks (from an image to be classified) into each of the standard shapes; i.e., the energy required to bend the thin plate associated with the new set of landmarks so that it represents the corresponding standard landmark set. Once we have the bending energy required to bend the new set of landmarks to each standard object, we assign the new image to the object that requires the least energy to be warped to. This classification process can be divided into two steps:
  1. TRAINING STEP:
    Select the landmarks for each of the standard objects.
  2. CLASSIFICATION STEP:
    Select the corresponding landmarks on the new image;
    Compute the bending energy required to warp the new image into each of the standard ones; and
    Classify the new shape as the standard object that requires the least bending energy.
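
The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the bending energy is computed up to a constant factor as trace(W^T K W), with the kernel U(r) = r^2 log r^2, and the names bending_energy, classify, "square", and "bent" are hypothetical. It is not the code of our system.

```python
import numpy as np

def bending_energy(src, dst):
    """Bending energy (up to a constant factor) of the spline warping src onto dst."""
    n = len(src)
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = d2 * np.log(np.where(d2 > 0.0, d2, 1.0))   # U(r) = r^2 log r^2, U(0) = 0
    P = np.hstack([np.ones((n, 1)), src])
    L = np.block([[K, P], [P.T, np.zeros((3, 3))]])
    W = np.linalg.solve(L, np.vstack([dst, np.zeros((3, 2))]))[:n]
    return float(np.trace(W.T @ K @ W))

def classify(new_landmarks, standards):
    """Return the label of the standard shape that is cheapest to warp to."""
    energies = {name: bending_energy(new_landmarks, lm)
                for name, lm in standards.items()}
    return min(energies, key=energies.get), energies

# Training step: hypothetical landmark sets for two standard objects.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
bent = square.copy()
bent[4] = [0.5, 0.8]
standards = {"square": square, "bent": bent}

# Classification step: a rotated and translated view of the square.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
new = square @ R.T + np.array([2.0, -1.0])

label, energies = classify(new, standards)   # label is "square"
```

Here the new landmarks are an exact rigid copy of one standard shape, so the energy to that shape is zero up to rounding. With hand-placed landmarks on real images the energies would all be nonzero and the smallest one decides the class.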

The bending energy required to warp one set of landmarks into another set is invariant to affine transformations (translation, rotation, scale, reflection, and shear). This feature makes it possible to classify a new image without having to normalize it to a common origin, scale, and orientation.
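
Two cases of this property are easy to check numerically, and the sketch below does so under the same assumptions as before (kernel U(r) = r^2 log r^2, energy up to a constant factor, hypothetical names and landmarks). It checks that a purely affine warp costs no bending energy, and that rotating and translating the destination landmarks leaves the energy unchanged. It covers only these two cases and is not a proof of the general statement.

```python
import numpy as np

def bending_energy(src, dst):
    """Bending energy (up to a constant factor) of the spline warping src onto dst."""
    n = len(src)
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = d2 * np.log(np.where(d2 > 0.0, d2, 1.0))   # U(r) = r^2 log r^2, U(0) = 0
    P = np.hstack([np.ones((n, 1)), src])
    L = np.block([[K, P], [P.T, np.zeros((3, 3))]])
    W = np.linalg.solve(L, np.vstack([dst, np.zeros((3, 2))]))[:n]
    return float(np.trace(W.T @ K @ W))

# Hypothetical landmark sets.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
dst = src.copy()
dst[4] = [0.5, 0.8]

# Case 1: the destination is an affine image of the source (scale and shear plus shift).
M = np.array([[1.5, 0.4],
              [-0.2, 0.8]])
affine_copy = src @ M.T + np.array([3.0, -2.0])
e_affine = bending_energy(src, affine_copy)        # zero up to rounding

# Case 2: the destination is rotated and translated before warping.
theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
e_plain = bending_energy(src, dst)
e_moved = bending_energy(src, dst @ R.T + np.array([-4.0, 7.0]))
```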


Mean Shape and Affine Transformations

In order to improve the shape representation of a standard object, we can average its landmark representations from several 2D projections (images) and compute its mean shape. Because the views differ by affine transformations in 3D or in 2D space, the landmarks of different views of the same object have to be normalized to a common origin, scale, and orientation before being averaged.

Given more than one view of an object, we identify the landmarks in each image, compute the centroid of the landmarks of each view (the average of the landmark points), and define landmark vectors from each centroid to its landmark points. We label corresponding landmarks consistently across views (in this work we label landmarks with colors). The centroid and the labels help in dealing with affine transformations among the different views.
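
The centroid and the landmark vectors of one view can be computed as in the short Python sketch below. The function name landmark_vectors and the example coordinates are hypothetical; the row order of the array plays the role of the labels.

```python
import numpy as np

def landmark_vectors(landmarks):
    """Return the centroid of a view and the vectors from it to each landmark."""
    centroid = landmarks.mean(axis=0)
    return centroid, landmarks - centroid

# Hypothetical view with four labeled landmarks (row i is landmark i).
view = np.array([[2.0, 1.0], [4.0, 1.0], [4.0, 3.0], [2.0, 3.0]])
centroid, vectors = landmark_vectors(view)          # centroid is [3., 2.]

# The same view shifted in the image yields the same landmark vectors.
_, shifted_vectors = landmark_vectors(view + np.array([10.0, -5.0]))
```

Working with the landmark vectors instead of the raw coordinates removes the translation between views; scale and orientation still have to be handled separately.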

After normalizing all the landmark representations of an object to a common origin, scale, and orientation, we can see that corresponding landmarks form clusters when all the landmark representations are superimposed. We compute the mean shape of the object by averaging corresponding landmarks over all views. Increasing the number of views used to compute the mean shape improves the confidence of the classification.
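
One possible way to carry out this normalization and averaging is sketched below in Python. The choices made here are assumptions, not a description of our system: each view is centered on its centroid, scaled to unit centroid size, and rotated onto the first view with a least-squares (Procrustes) rotation obtained from an SVD. The names normalize, align_rotation, and mean_shape are hypothetical.

```python
import numpy as np

def normalize(landmarks):
    """Center a view on its centroid and scale it to unit centroid size."""
    centered = landmarks - landmarks.mean(axis=0)
    return centered / np.linalg.norm(centered)

def align_rotation(shape, reference):
    """Rotate a normalized shape onto a normalized reference (least squares)."""
    U, _, Vt = np.linalg.svd(shape.T @ reference)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # keep a proper rotation
    return shape @ (U @ D @ Vt)

def mean_shape(views):
    """Average corresponding landmarks after aligning every view to the first one."""
    reference = normalize(views[0])
    aligned = [align_rotation(normalize(v), reference) for v in views]
    return np.mean(aligned, axis=0)

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Hypothetical views: the same landmarks seen at different positions, scales, and angles.
base = np.array([[0.0, 0.0], [2.0, 0.0], [2.5, 1.0], [1.0, 2.0], [0.0, 1.0]])
view1 = 2.0 * base @ rotation(0.7).T + np.array([5.0, 1.0])
view2 = 0.5 * base @ rotation(-1.2).T + np.array([-3.0, 4.0])

mean = mean_shape([base, view1, view2])
```

Because these three views differ only by translation, scale, and rotation, the aligned landmarks coincide and the mean equals the normalized first view. With real views the aligned landmarks form the clusters described above, and the mean is the center of each cluster.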

Results

Our system implements the mean shape computation, the bending energy evaluation, and the shape classification. All the figures presented below are organized as:

Shape classification for different models of cars:

Shape classification for face recognition:

Shape classification for likeness:

We can easily see that the bending energy of a shape increases with the number of landmarks used. We can also see that increasing the number of landmarks used to represent a shape increases the accuracy of the landmark and bending energy representation. Consequently, by increasing the number of landmarks we can increase the confidence of the classifications.


Applications


Limitations


Conclusions

The bending-energy-based classifier has proven to be reliable and robust. Even with small numbers of landmarks (around 10 in the examples above), the classifications were consistent with the expected results. However, we could improve the confidence even further by using larger numbers of landmarks per class (which increases the bending energy).

By increasing the number of views used to compute the mean shape, we can also improve the confidence of the classifications.


References


Last updated, Mon Apr 29 23:54:02 EDT 1996 by bastos@cs.unc.edu.