COMP 776 Spring 2011

Final Assignment: Bag-of-Features Image Classification (with Competition!)

Due date: Saturday, April 23, 11:59PM (report), Monday, April 25, 11:59PM (contest)

The Data (12 MB)

(source: Caltech Vision Group)

The goal of the assignment is to implement a bag-of-features system for four-class image classification, the four classes being airplanes, motorbikes, faces, and cars. The author of the highest-performing system will get a prize (see below)! The data file contains training and validation subdirectories for each category. The training subdirectories contain 40 images each, and the validation subdirectories contain 100 images each. For the initial phase of the assignment, you will train your system on the training images and evaluate its performance on the validation set. By the end of Saturday, April 23, you will turn in your code and report as usual, and tell me the best classification accuracy you achieved on the validation set. After that, I will make the "real" test set available. You will run your system on the test set only once and let me know the accuracy you got by the end of Monday, April 25 (see the bottom of this page for detailed instructions). This information will be used to judge the recognition contest.

System Outline and Implementation Details

  1. Feature extraction. You can use any of the following methods:
    • Sampling of fixed-size image patches on a regular grid. You can use either a single patch size or several different sizes.
    • Sampling of random-size patches at random locations.
    • Regions produced by your blob detector from Assignment 2. It is also fine to use the blob detector provided as the solution to the assignment or to download somebody else's detector from the Web.
    • Fixed-size patches sampled around corner locations (the usual corner detector).
    • Patches produced by any other detector you download.

  2. Feature description. You can use either the raw patches themselves (possibly downsampled or intensity-normalized), compute SIFT descriptors of the patches, or use any other descriptor you find in the literature, e.g., a color histogram. Here is sample code for computing SIFT descriptors of circular regions, such as the ones returned by a blob detector from Assignment 2. Note that this code is not rotation-invariant, i.e., it does not attempt to normalize the patches by rotating them so that the horizontal direction is aligned with the dominant gradient orientation of the patch. However, rotation invariance is not really necessary for the assignment.

  3. Dictionary computation. Run k-means clustering (kmeans function in MATLAB) on a subset of all training features to learn the dictionary centers. A reasonable dictionary size is about 500, but feel free to experiment with several different sizes.

  4. Feature quantization and histogram computation. For each feature in a training or a test image, find the index of the nearest codevector in the dictionary. You may want to use this code for fast computation of squared Euclidean distances between two sets of vectors (i.e., all descriptors in an image and the codebook). Following quantization, represent each image by the histogram of these indices (check out MATLAB's hist function). Because different images can have different numbers of features, the histograms should be normalized to sum to one.

  5. Classifier training. The simplest option for this part of the assignment is a k-nearest-neighbor (kNN) classifier. MATLAB has a knnclassify function, but it appears to work only with a few pre-defined distance functions. If you want to experiment with a different distance function such as the chi-squared (chi2) distance, you may have to implement your own kNN function. To have a better chance of winning the contest, you should try to train a support vector machine (SVM) classifier. MATLAB includes SVM training and testing functions: svmtrain and svmclassify. Alternatively, you can download an SVM package from the Web. Here is one SVM package that is fairly easy to integrate with MATLAB.
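
Steps 3 to 5 above can be sketched end to end on toy data. The sketch below is written in plain Python (standard library only) rather than MATLAB, so it is meant as an illustration of the logic and not as a drop-in replacement for kmeans, hist, or knnclassify. All function names (sq_dist, nearest, kmeans, bof_histogram, chi2, knn_predict) are hypothetical, and the 0.5 factor in the chi-squared distance is one common convention, assumed here. Descriptors are assumed to be lists of numbers, one list per feature.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, v):
    """Index of the dictionary center closest to descriptor v."""
    return min(range(len(centers)), key=lambda j: sq_dist(centers[j], v))


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k dictionary centers."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            groups[nearest(centers, f)].append(f)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def bof_histogram(centers, image_features):
    """Quantize one image's features; return a histogram that sums to one."""
    hist = [0.0] * len(centers)
    for f in image_features:
        hist[nearest(centers, f)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def chi2(p, q):
    """Chi-squared distance between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def knn_predict(train_hists, train_labels, query, k=1):
    """Majority vote among the k training histograms closest to query."""
    order = sorted(range(len(train_hists)),
                   key=lambda i: chi2(train_hists[i], query))
    votes = {}
    for i in order[:k]:
        votes[train_labels[i]] = votes.get(train_labels[i], 0) + 1
    return max(votes, key=votes.get)
```

In a real run, kmeans would be called once on a subset of the training descriptors, bof_histogram once per image, and knn_predict once per validation image. The pure-Python loops are far too slow for a 500-word dictionary of SIFT descriptors, which is why a vectorized distance computation is worth using in practice.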

Grading

For full credit, you should implement a working, fully documented system by making a single implementation choice for each of the above components, and obtain results that are (significantly) above chance. The performance of your system should be measured in terms of the classification rate, or the percentage of all test images correctly classified by your system. In your report, please make sure to prominently list the best classification rate achieved by your algorithm on the validation sets for each category.
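
As a point of reference, with four classes and 100 validation images per class, chance performance is 25%. The classification rate described above could be computed along the following lines. This is a minimal Python sketch, assuming the true and predicted labels are available as two equal-length, non-empty lists; the function name classification_rates is hypothetical.

```python
def classification_rates(true_labels, predicted_labels):
    """Return (overall_rate, per_class_rates), both as percentages."""
    correct = {}
    total = {}
    for t, p in zip(true_labels, predicted_labels):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    # Overall rate: correctly classified images over all images.
    overall = 100.0 * sum(correct.values()) / len(true_labels)
    # Per-category rate: correct images of a class over all images of that class.
    per_class = {c: 100.0 * correct[c] / total[c] for c in total}
    return overall, per_class
```

For example, calling it with true labels ["faces", "faces", "cars", "cars"] and predictions ["faces", "cars", "cars", "cars"] gives an overall rate of 75.0, with 50.0 for faces and 100.0 for cars.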

The grading will be based primarily on your report. I do not intend to run your code, though you must include it, and I may look at some parts of it. The report should thoroughly document everything you implemented and all important experimental findings (recognition rates for different versions of features, descriptors, classifiers, etc.). If you download code from the Web, state exactly where you downloaded it from and how you used it. DO NOT download somebody else's complete recognition system, only individual pieces that help with some aspects of the assignment.

Bonus points

  • Compare performance of different implementation choices for one or more system components, and/or investigate the effect of important system parameters (dictionary size, number of training images used, k in kNN, etc.). Wherever relevant, feel free to discuss computation time in addition to classification rate.
  • Download or implement alternative components (feature detectors, feature descriptors, classifiers) for a bag-of-features pipeline and compare results with the more basic implementation options.
  • Explore alternatives to bag-of-features classification. For example, try this code for extraction of global "gist" descriptors and use these descriptors to classify images with kNN or SVM.
Note that when deciding how many bonus points to assign, I will take into account the amount of extra work done by everybody in the class. You should aim to significantly exceed the "average" amount of effort in order to get a bonus.

Competition!!!

In an attempt to make this assignment more fun and exciting, I am adding a competition aspect. The person who achieves the highest classification rate on the test set (to be released after the initial due date) will receive bonus points and a valuable prize that I will disclose on the last day of class. Apart from the competition, the classification rate of your algorithm will not be strongly considered as part of your grade, unless it reflects serious implementation mistakes.

Turning in the Assignment

There are two due dates, one for the main report, and one for the contest. Please turn in your report via Blackboard by 11:59PM on Saturday, April 23rd. Don't forget that the report must prominently feature the overall classification rate of your system on the validation set. After I receive everybody's reports, I will send out an email with a link to the test set file. To enter the contest, run your system on the test set once and email me your classification rate and the indices of images you got wrong by 11:59PM on Monday, April 25th. The subject of the email should be "COMP 776 recognition contest". The contest results will be announced during the last class on Tuesday.