COMP 776 Spring 2010
Final Assignment: Bag-of-Features Image Classification (with Competition!)
Due date: Sunday, April 25, 5PM
(source: Caltech Vision Group)
The goal of the assignment is to implement a system for bag-of-features image classification.
The author of the highest-performing system will get a prize (see below)! The task is
four-class image classification, with the classes being airplanes, motorbikes, faces,
and cars. The data file contains training and test subdirectories for each category.
The test subdirectories contain 50 images each, and the training subdirectories contain up to 500
images each. You must test your system on all the test images, and train it on
at least 40 training images per class.
Keep in mind that using more training data will almost certainly result in better performance.
However, if your computational resources are limited and your system is slow, it's OK to use less
training data to save time. You can also experiment with splitting up the training images into
two subsets, one for learning the visual dictionary, and one for learning the classifier.
System Outline and Implementation Details
- Feature extraction. You can use any of the following methods:
- Sampling of fixed-size image patches on a regular grid. You can use either a single patch size or several different sizes.
- Sampling of random-size patches at random locations.
- Regions produced by your blob detector from Assignment 2. It is also fine to use the blob detector provided as the solution
to the assignment or to download somebody else's detector from the Web.
- Fixed-size patches sampled around corner locations (sample corner detector).
- Patches produced by any other detector you download.
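As an illustration of the simplest option above (your actual implementation is expected in MATLAB), grid sampling of fixed-size patches might look like the following NumPy sketch; the function name, patch size, and stride are illustrative choices, not values prescribed by the assignment:

```python
import numpy as np

def sample_grid_patches(image, patch_size=16, stride=8):
    """Sample fixed-size square patches on a regular grid.

    `patch_size` and `stride` are illustrative defaults; experiment
    with several sizes if you like.
    """
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return np.array(patches)
```

With a stride smaller than the patch size, neighboring patches overlap, which generally yields more features per image at modest extra cost.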
- Feature description. You can use the raw patches themselves (possibly downsampled or intensity-normalized),
compute SIFT descriptors of the patches, or use any other descriptor you find in the literature, e.g., a color
histogram. Here is sample code for computing SIFT descriptors of circular regions, such as the
ones returned by a blob detector from Assignment 2. Note that this code is not rotation-invariant, i.e., it does
not attempt to normalize the patches by rotating them so that the horizontal direction is aligned with the dominant
gradient orientation of the patch. However, rotation invariance is not really necessary for the assignment.
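For the raw-patch option, downsampling plus intensity normalization can be sketched as follows (NumPy; the block-averaging scheme, output size, and zero-mean/unit-norm normalization are illustrative assumptions, not requirements):

```python
import numpy as np

def patch_descriptor(patch, out_size=8):
    """Downsample a square patch by block averaging, then normalize
    to zero mean and unit norm for some robustness to illumination."""
    p = patch.astype(float)
    s = p.shape[0] // out_size
    # average s-by-s blocks to get an out_size-by-out_size summary
    p = p[:s * out_size, :s * out_size].reshape(out_size, s, out_size, s).mean(axis=(1, 3))
    v = p.ravel()
    v = v - v.mean()               # remove mean intensity
    n = np.linalg.norm(v)
    return v / n if n > 0 else v   # unit norm (unless the patch is constant)
```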
- Dictionary computation. Run k-means clustering (kmeans function in MATLAB) on a subset
of all training features to learn the dictionary centers.
Set the dictionary size to about 500, or experiment with several different sizes.
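MATLAB's kmeans does this step for you; for reference, the underlying algorithm (plain Lloyd's iterations) amounts to the following NumPy sketch, with illustrative defaults for the iteration count and seed:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means: assign points to nearest centers,
    then recompute each center as the mean of its members."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # squared distances from every point to every center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels
```

In practice, running k-means on a random subsample of a few hundred thousand descriptors (rather than all of them) keeps the dictionary computation tractable.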
- Feature quantization and histogram computation. For each feature in a training or a test image,
find the index of the nearest codevector in the dictionary. You may want to use this code for
fast computation of squared Euclidean distances between two sets of vectors (i.e., all descriptors in an
image and the codebook). Following quantization, represent each image by the histogram of
these indices (check out MATLAB's hist function). Because different images can have different
numbers of features, the histograms should be normalized to sum to one.
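The fast pairwise-distance trick mentioned above uses the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b, which turns the computation into one matrix product. A NumPy sketch of quantization and histogram building (function names illustrative):

```python
import numpy as np

def squared_dists(A, B):
    """All pairwise squared Euclidean distances between the rows of A
    and the rows of B, via ||a||^2 + ||b||^2 - 2 a.b."""
    return (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T

def bof_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest codevector and return
    the normalized histogram of codevector indices."""
    idx = squared_dists(descriptors, codebook).argmin(axis=1)
    h = np.bincount(idx, minlength=len(codebook)).astype(float)
    return h / h.sum()   # normalize: images have different feature counts
```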
- Classifier training. The simplest options for this part of the assignment are a k-nearest-neighbor (kNN)
classifier or a Naive Bayes classifier. MATLAB has a knnclassify function,
but it only appears to work with a few pre-defined distance functions. If you want to experiment with a
different distance function such as chi2, you may have to implement your own kNN function.
For bonus points (and to have a better chance of winning the contest), you should try to train a support vector
machine (SVM) classifier. MATLAB includes SVM training and testing functions: svmtrain and
svmclassify. Alternatively, you can download an SVM package from the Web.
Here is one
SVM package that is fairly easy to integrate with MATLAB.
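If you do implement your own kNN with the chi2 distance, the core is small; here is a NumPy sketch (names and the epsilon guard against division by zero are illustrative):

```python
import numpy as np

def chi2_dist(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def knn_classify(test_hists, train_hists, train_labels, k=1):
    """Classify each test histogram by majority vote among its
    k nearest training histograms under the chi2 distance."""
    preds = []
    for h in test_hists:
        d = np.array([chi2_dist(h, t) for t in train_hists])
        nn = np.argsort(d)[:k]
        vals, counts = np.unique(train_labels[nn], return_counts=True)
        preds.append(vals[counts.argmax()])
    return np.array(preds)
```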
Grading
For full credit, you should implement a working, fully documented system by making a single
implementation choice for each of the above components, and obtain results that are (significantly) above
chance. The performance of your system should be measured in terms of the classification rate,
or the percentage of all test images correctly classified by your system. Please make sure to
prominently list the best classification rate achieved by your algorithm when trained on the first
40 images from each training directory. This is the number that will be entered in the contest.
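For the avoidance of doubt, the classification rate is simply the fraction of correct predictions expressed as a percentage:

```python
import numpy as np

def classification_rate(pred, truth):
    """Percentage of test images whose predicted label matches the truth."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    return 100.0 * (pred == truth).mean()
```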
The grading will be based primarily on your report. I do not intend to run your code,
though you must include it, and I may look at some parts of it. The report should thoroughly document
everything you implemented and all important experimental findings (recognition rates for different
versions of features, descriptors, classifiers, etc.). If you download code from the Web, state exactly
where you downloaded it from and how you used it. DO NOT download somebody else's complete recognition
system, only individual pieces that help with some aspects of the assignment.
Bonus points
- Use SVMs for classification.
- Compare performance of different implementation choices for
one or more system components, and/or investigate the effect of important system parameters
(dictionary size, number of training images used, k in kNN, etc.). Wherever relevant, feel
free to discuss computation time in addition to classification rate.
- Download or implement alternative components (feature detectors, feature descriptors,
classifiers) for a bag-of-features pipeline and compare results with the more basic implementation options.
- Explore alternatives to bag-of-features classification.
For example, try this code
for extraction of global "gist" descriptors
and use these descriptors to classify images with kNN or SVM.
Note that when deciding how many bonus points to assign, I will take into account the amount
of extra work done by everybody in the class. Roughly speaking, you should aim to exceed the
"average" amount of effort in order to get a bonus.
Competition!!!
In an attempt to make this assignment more fun and exciting, I am adding a competition aspect.
The person who achieves the highest classification rate on the dataset with training on the
first 40 training images from each directory will receive bonus points
and a valuable prize that I will disclose on the last day of class. Apart from the competition,
the classification rate of your algorithm will not be strongly considered as part of your grade,
unless it reflects serious implementation mistakes.
Turning in the Assignment
As usual, please email me your report in PDF format and your code. As usual, the file names should be
firstname_lastname.pdf and firstname_lastname.zip, and the email subject should be
COMP776 Final Assignment.
The deadline is 5PM, Sunday, April 25 -- ABSOLUTELY NO LATE SUBMISSIONS!