COMP 776 Spring 2009

Assignment 2: Scale-space blob detection

Due date: February 24th at 5PM

The goal of the assignment is to implement a Laplacian blob detector as discussed in the February 5th lecture (PPT, PDF).

Algorithm outline:

  1. Generate a Laplacian of Gaussian filter.
  2. Build a Laplacian scale space, starting with some initial scale and going for n iterations:
    1. Filter image with scale-normalized Laplacian at current scale.
    2. Save square of Laplacian response for current level of scale space.
    3. Increase scale by a factor k.
  3. Perform nonmaximum suppression in scale space.
  4. Display resulting circles at their characteristic scales.

Details

  • Don't forget to convert images to grayscale (rgb2gray command) and double (im2double).

  • When filtering the image with the Laplacian, you should be careful in computing the filter mask size. Hint: if all your blobs look systematically shifted over from where they should be, you are making a very basic mistake in setting the filter mask size.

  • It is relatively inefficient to repeatedly filter the image with a kernel of increasing size. Instead of increasing the kernel size by a factor of k, try downsampling the image by a factor 1/k. In that case, you will have to upsample the result or do some interpolation in order to find maxima in scale space. Hint: scale normalization may not work the same way when you downsample the image instead of increasing the scale of the filter.

  • You have to choose the initial scale, the factor k by which the scale is multiplied each time, and the number of levels in the scale space. I typically set the initial scale to 2, and use 10 to 15 levels in the scale pyramid. The multiplication factor should depend on the largest scale at which you want regions to be detected.

  • You may want to use a three-dimensional array to represent your scale space. It would be declared as follows:

    scale_space = zeros(h,w,n); % [h,w] - dimensions of image, n - number of levels in scale space

    Then scale_space(:,:,i) would give you the ith level of the scale space. Alternatively, if you are storing different levels of the scale pyramid at different resolutions, you may want to use a cell array, where each "slot" can accommodate a different data type or a matrix of different dimensions. Here is how you would use it:

    scale_space = cell(n,1); %creates a cell array with n "slots"
    scale_space{i} = my_matrix; % store a matrix at level i


  • To perform nonmaximum suppression in scale space, you may first want to do nonmaximum suppression in each 2D slice separately. For this, you may find functions nlfilter, colfilt or ordfilt2 useful. To extract the final nonzero values (corresponding to detected regions), you may want to use the find function.

  • You also have to set a threshold on the squared Laplacian response above which to report region detections. You should play around with different values and choose one you like best.

  • To display the detected regions as circles, you can use this function (or feel free to write your own). Hint: Don't forget that there is a multiplication factor that relates the scale at which a region is detected to the radius of the circle that most closely "approximates" the region.

  • If you decide to experiment with different implementation choices for building the scale space and nonmaximum suppression, you may want to compare these choices according to their computational efficiency. To time different routines, use tic and toc commands.

For bonus points

  • Implement the difference-of-Gaussian pyramid as mentioned in class and described in David Lowe's paper. Compare the results and the running time to the direct Laplacian implementation.

  • Implement the affine adaptation step to turn circular blobs into ellipses as shown in the lecture (just one iteration is sufficient). The selection of the correct window function is essential here. You should use a Gaussian window that is a factor of 1.5 or 2 larger than the characteristic scale of the blob. Note that the lecture slides show how to find the relative shape of the second moment ellipse, but not the absolute scale (i.e., the axis lengths are defined up to some arbitrary constant multiplier). A good choice for the absolute scale is to set the sum of the major and minor axis half-lengths to the diameter of the corresponding Laplacian circle.

  • The Laplacian has a strong response not only at blobs, but also along edges. However, recall from the class lecture that edge points are not "repeatable". So, implement an additional thresholding step that computes the Harris response at each detected Laplacian region and rejects the regions that have only one dominant gradient orientation (i.e., regions along edges). If you have implemented the affine adaptation step, these would be the regions whose characteristic ellipses are close to being degenerate (i.e., one of the eigenvalues is close to zero). Show both "before" and "after" detection results.

Test images

Here are four images to test your code, and sample output images for your reference. Keep in mind, though, that your output may look different depending on your threshold, range of scales, and other implementation details. In addition to the images provided, also run your code on at least two images of your own choosing.

What to hand in

You should prepare a (very brief) report including the following:
  • An explanation of any "interesting" implementation choices that you made.
  • An explanation of parameter values you have tried and which ones you found to be optimal.
  • A discussion of any extensions or bonus features you have implemented.
  • Output of your detector on all test images for your "favorite" parameter settings.
Hand in your report (as a PDF file), code, and your two additional test images in a single zip file. If the zip file is bigger than 2MB, do not email it to me. Put it on the Web and send me the URL.

Grading

  • A correct "brute-force" implementation of the detector (i.e., filtering with Laplacian kernel of increasing size at each interation) gets you (the equivalent of) H-.
  • A correct efficient implementation (downsampling the image or using a difference-of-Gaussians pyramid) gets you an H.
  • A correct efficient implementation plus one or more "bonus" features gets you an H+.
  • An incorrect implementation gets you a P+ or below (depending on the nature and seriousness of the mistakes).

Helpful pointers