COMP 776 Spring 2010

Final Assignment: Bag-of-Features Image Classification (with Competition!)

Due date: Sunday, April 25, 5PM

The Data (45 MB)

(source: Caltech Vision Group)

The goal of this assignment is to implement a system for bag-of-features image classification. The author of the highest-performing system will get a prize (see below)! The task is four-class image classification, with the four classes being airplanes, motorbikes, faces, and cars. The data file contains training and test subdirectories for each category. The test subdirectories contain 50 images each, and the training subdirectories contain up to 500 images each. You must test your system on all the test images and train it on at least 40 training images per class. Keep in mind that using more training data will almost certainly improve performance; however, if your computational resources are limited and your system is slow, it is OK to use fewer training images to save time. You can also experiment with splitting the training images into two subsets: one for learning the visual dictionary and one for learning the classifier.
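
The suggested split of training images into a dictionary-learning subset and a classifier-training subset could be sketched as follows. (The assignment assumes MATLAB; this is an equivalent Python sketch, and the function name and file names are purely illustrative.)

```python
import random

def split_training_set(image_paths, n_dictionary, seed=0):
    """Split training images into a dictionary-learning subset and a
    classifier-training subset. Hypothetical helper; names are illustrative."""
    rng = random.Random(seed)
    paths = list(image_paths)
    rng.shuffle(paths)  # randomize before splitting so both subsets are representative
    return paths[:n_dictionary], paths[n_dictionary:]

# Example: 100 training images, 30 reserved for learning the dictionary.
all_imgs = [f"img_{i:03d}.jpg" for i in range(100)]
dict_imgs, clf_imgs = split_training_set(all_imgs, 30)
print(len(dict_imgs), len(clf_imgs))  # 30 70
```

The two subsets are disjoint, which avoids the mild overfitting that can occur when the same features are used both to build the codebook and to train the classifier.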

System Outline and Implementation Details

  1. Feature extraction. You can use any of the following methods:
    • Sampling of fixed-size image patches on a regular grid. You can use either a single patch size or several different sizes.
    • Sampling of random-size patches at random locations.
    • Regions produced by your blob detector from Assignment 2. It is also fine to use the blob detector provided as the solution to the assignment or to download somebody else's detector from the Web.
    • Fixed-size patches sampled around corner locations (sample corner detector).
    • Patches produced by any other detector you download.
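
The first option above, sampling fixed-size patches on a regular grid, could be sketched as follows. (Python rather than MATLAB; the function name and default parameters are illustrative choices, not part of the assignment.)

```python
import numpy as np

def sample_grid_patches(image, patch_size=16, stride=8):
    """Sample fixed-size square patches on a regular grid.
    `image` is a 2-D grayscale array; returns an (N, patch_size**2) array
    with one flattened patch per row."""
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size].ravel())
    return np.array(patches)

img = np.random.rand(64, 64)
P = sample_grid_patches(img)
print(P.shape)  # (49, 256): a 7x7 grid of 16x16 patches
```

Sampling at several patch sizes just means calling the function repeatedly with different `patch_size` values and concatenating the descriptors computed from each set of patches.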

  2. Feature description. You can use the raw patches themselves (possibly downsampled or intensity-normalized), SIFT descriptors of the patches, or any other descriptor you find in the literature, e.g., a color histogram. Here is sample code for computing SIFT descriptors of circular regions, such as the ones returned by the blob detector from Assignment 2. Note that this code is not rotation-invariant, i.e., it does not attempt to normalize the patches by rotating them so that the horizontal direction is aligned with the dominant gradient orientation of the patch. However, rotation invariance is not really necessary for this assignment.
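
As a rough sketch of the simplest option, a raw-patch descriptor that downsamples by block averaging and then intensity-normalizes might look like this. (Python rather than MATLAB; the function name and the 8x8 output size are illustrative assumptions.)

```python
import numpy as np

def raw_patch_descriptor(patch, out_size=8):
    """Describe a square grayscale patch by block-averaging it down to
    out_size x out_size, then normalizing to zero mean and unit norm so
    the descriptor is invariant to affine intensity changes."""
    s = patch.shape[0] // out_size
    small = patch[:s * out_size, :s * out_size]
    small = small.reshape(out_size, s, out_size, s).mean(axis=(1, 3))
    d = small.ravel() - small.mean()      # zero mean
    n = np.linalg.norm(d)
    return d / n if n > 0 else d          # unit norm (guard against flat patches)

desc = raw_patch_descriptor(np.random.rand(16, 16))
print(desc.shape)  # (64,)
```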

  3. Dictionary computation. Run k-means clustering (kmeans function in MATLAB) on a subset of all training features to learn the dictionary centers. Set the dictionary size to about 500, or experiment with several different sizes.
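
If you are not working in MATLAB, a plain Lloyd's-iteration k-means can stand in for the kmeans function. The sketch below is a minimal, unoptimized version (function name and iteration count are illustrative; for real use you would add convergence checks and restarts).

```python
import numpy as np

def kmeans_dictionary(features, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means. `features` is an (N, D) array of descriptors;
    returns a (k, D) array of cluster centers (the visual dictionary)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(n_iter):
        # Assign each feature to its nearest center.
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Recompute each center; keep the old one if its cluster empties.
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(0)
    return centers

feats = np.random.rand(1000, 128)   # e.g., a subset of all training descriptors
codebook = kmeans_dictionary(feats, 50)
print(codebook.shape)  # (50, 128)
```

With a 500-word dictionary and tens of thousands of descriptors, running k-means on a random subset of the training features (as the assignment suggests) keeps the clustering step tractable.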

  4. Feature quantization and histogram computation. For each feature in a training or test image, find the index of the nearest codevector in the dictionary. You may want to use this code for fast computation of squared Euclidean distances between two sets of vectors (i.e., all the descriptors in an image and the codebook). Following quantization, represent each image by the histogram of these indices (check out MATLAB's hist function). Because different images can have different numbers of features, normalize each histogram to sum to one.
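
The fast-distance trick is the standard expansion ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2, which turns all pairwise distances into one matrix multiply. A Python sketch of quantization plus normalized-histogram computation (function names are illustrative):

```python
import numpy as np

def sq_distances(X, C):
    """Squared Euclidean distances between rows of X (n, d) and C (k, d),
    via the expansion ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2."""
    return (X ** 2).sum(1)[:, None] - 2 * X @ C.T + (C ** 2).sum(1)[None, :]

def bow_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest codevector and return the
    normalized histogram of codeword indices (the bag-of-features vector)."""
    idx = sq_distances(descriptors, codebook).argmin(1)
    hist = np.bincount(idx, minlength=len(codebook)).astype(float)
    return hist / hist.sum()   # normalize: images have varying feature counts

codebook = np.random.rand(500, 128)
descriptors = np.random.rand(300, 128)
h = bow_histogram(descriptors, codebook)
print(h.shape)  # (500,)
```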

  5. Classifier training. The simplest options for this part of the assignment are a k-nearest-neighbor (kNN) classifier or a Naive Bayes classifier. MATLAB has a knnclassify function, but it only appears to work with a few pre-defined distance functions. If you want to experiment with a different distance function, such as the chi-squared (chi2) distance, you may have to implement your own kNN function. For bonus points (and a better chance of winning the contest), you should try to train a support vector machine (SVM) classifier. MATLAB includes SVM training and testing functions: svmtrain and svmclassify. Alternatively, you can download an SVM package from the Web. Here is one SVM package that is fairly easy to integrate with MATLAB.
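
A hand-rolled kNN with the chi-squared distance is only a few lines. The sketch below (Python rather than MATLAB; function names and the toy data are illustrative) classifies one test histogram by majority vote over its k nearest training histograms:

```python
import numpy as np

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalized histograms; eps avoids
    division by zero in empty bins."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def knn_classify(train_hists, train_labels, test_hist, k=5):
    """Majority vote among the k nearest training histograms under chi2."""
    d = np.array([chi2_distance(test_hist, h) for h in train_hists])
    nearest = np.argsort(d)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy example: two well-separated "classes" of 20-bin histograms.
rng = np.random.default_rng(0)
tr, labels = [], []
for _ in range(10):
    h = rng.random(20); h[:5] += 5          # class "faces": mass in early bins
    tr.append(h / h.sum()); labels.append("faces")
    h = rng.random(20); h[15:] += 5         # class "cars": mass in late bins
    tr.append(h / h.sum()); labels.append("cars")
q = np.ones(20); q[:5] += 5; q /= q.sum()   # query resembling "faces"
print(knn_classify(tr, labels, q))  # faces
```

Ties in the vote are broken arbitrarily here; with four classes you may want an odd k or a distance-weighted vote.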

Grading

For full credit, you should implement a working, fully documented system by making a single implementation choice for each of the above components, and obtain results that are (significantly) above chance. The performance of your system should be measured in terms of the classification rate, i.e., the percentage of all test images correctly classified by your system. Please make sure to prominently list the best classification rate achieved by your algorithm when trained on the first 40 images from each training directory. This is the number that will be entered in the contest.

The grading will be based primarily on your report. I do not intend to run your code, though you must include it, and I may look at some parts of it. The report should thoroughly document everything you implemented and all important experimental findings (recognition rates for different versions of features, descriptors, classifiers, etc.). If you download code from the Web, state exactly where you downloaded it and how you used it. DO NOT download somebody else's complete recognition system, only individual pieces that help with some aspects of the assignment.

Bonus points

  • Use SVMs for classification.
  • Compare performance of different implementation choices for one or more system components, and/or investigate the effect of important system parameters (dictionary size, number of training images used, k in kNN, etc.). Wherever relevant, feel free to discuss computation time in addition to classification rate.
  • Download or implement alternative components (feature detectors, feature descriptors, classifiers) for a bag-of-features pipeline and compare results with the more basic implementation options.
  • Explore alternatives to bag-of-features classification. For example, try this code for extraction of global "gist" descriptors and use these descriptors to classify images with kNN or SVM.

Note that when deciding how many bonus points to assign, I will take into account the amount of extra work done by everybody in the class. Roughly speaking, you should aim to exceed the "average" amount of effort in order to get a bonus.

Competition!!!

In an attempt to make this assignment more fun and exciting, I am adding a competition aspect. The person who achieves the highest classification rate on the dataset, training on the first 40 training images from each directory, will receive bonus points and a valuable prize that will be disclosed by me on the last day of class. Apart from the competition, the classification rate of your algorithm will not strongly affect your grade, unless it reflects serious implementation mistakes.

Turning in the Assignment

As usual, please email me your report in PDF format along with your code. The file names should be firstname_lastname.pdf and firstname_lastname.zip, and the email subject should be COMP776 Final Assignment. The deadline is 5PM, Sunday, April 25 -- ABSOLUTELY NO LATE SUBMISSIONS!