COMP 776: Computer Vision

Spring 2008, T TH 3:30-4:45, SN 115

Instructor: Svetlana Lazebnik  (lazebnik -at- cs.unc.edu)


Overview

In the simplest terms, computer vision is the discipline of "teaching machines how to see." This field dates back more than forty years, but the recent explosive growth of digital imaging technology makes the problems of automated image interpretation more exciting and relevant than ever. There are two major themes in the computer vision literature: 3D geometry and recognition. The first theme is about using vision as a source of metric 3D information: given one or more images of a scene taken by a camera with known or unknown parameters, how can we go from 2D to 3D, and how much can we tell about the 3D structure of the environment pictured in those images? The second theme, by contrast, is all about vision as a source of semantic information: can we recognize the objects, people, or activities pictured in the images, and understand the structure and relationships of different scene components just as a human would? This course will strive to provide a unified perspective on the different aspects of computer vision, and give students the ability to understand vision literature and implement components that are fundamental to many modern vision systems.

Prerequisites: Basic knowledge of probability, linear algebra, and calculus. MATLAB programming experience and previous exposure to image processing are desirable, but not required.

Textbook: The recommended textbook is Computer Vision: A Modern Approach by David Forsyth and Jean Ponce (abbreviated F&P in the schedule). Lectures will follow it only loosely, and additional handouts and notes will be distributed throughout the course.

Grading: Computer vision is a very hands-on subject. For this reason, the coursework will primarily consist of implementation (please make sure you have access to MATLAB with the Image Processing Toolbox installed). There will be three or four minor programming assignments and a final project. The weights assigned to different course components will be as follows:
  • Assignments: 40%
  • Final project: 50%
  • Participation: 10%

Syllabus

I. Image formation
  • Camera models
  • Light and color
  • Linear filters and edges
  • Feature extraction (corners and blobs)
II. Grouping and fitting
  • Hough transform
  • RANSAC
  • Alignment
III. Geometric vision
  • Camera calibration
  • Epipolar geometry
  • Two-view and multi-view stereo
  • Structure from motion
IV. Recognition
  • Bags of features
  • Generative and discriminative models
  • Face detection and recognition
V. "Miscellaneous"
  • Segmentation
  • Optical flow
  • Tracking

Schedule

| Date | Topic | Readings, assignments |
|---|---|---|
| January 10 | What is computer vision? PPT (20MB), PDF (5MB) | Resource: MATLAB tutorial |
| January 15 | Cameras: PPT (19MB), PDF (4MB) | Reading: F&P ch. 1 |
| January 17 | Radiometry: PPT (10MB), PDF (1.3MB) | Reading: F&P ch. 4, 5 |
| January 22 | Color: PPT (28MB), PDF (2MB) | Reading: F&P ch. 6 |
| January 24 | Linear filtering: PPT (7MB), PDF (3.5MB) | Reading: F&P ch. 7 |
| January 29 | Edge detection: PPT (5.5MB), PDF (2MB) | Reading: F&P ch. 8 |
| January 31 | Corner and blob detection: PPT (9MB), PDF (3MB) | Homework: Assignment 1 out. Resource: Sample Harris detector code |
| February 5 | Hough transform: PPT (6.5MB), PDF (1.5MB) | Reading: F&P ch. 15 |
| February 7 | Fitting: PPT (1.6MB), PDF (0.3MB) | Reading: F&P sec. 3.1, ch. 15 |
| February 12 | Image alignment: PPT (11MB), PDF (1.8MB) | Reading: Distinctive image features from scale-invariant keypoints |
| February 14 | Alignment concluded | Assignment 1 due. Homework: Start looking at final project options |
| February 19 | Camera calibration, epipolar geometry: PPT (5.5MB), PDF (1.5MB) | Reading: F&P ch. 2, 3, sec. 10.1 |
| February 21 | Stereo: PPT (18MB), PDF (3MB) | Reading: F&P ch. 11 |
| February 26 | Stereo continued | Homework: Assignment 2 out |
| February 28 | Multi-view stereo: PPT (35MB), PDF (2.5MB) | Project proposal due (project options) |
| March 4 | Multi-view stereo concluded | |
| March 6 | Structure from motion: PPT (5MB), PDF (1MB) | Reading: F&P sec. 12.3, 12.4, 13.3.1, 13.4, 13.5 |
| March 18 | Intro to recognition: PPT (22MB), PDF (4.5MB) | Assignment 2 due |
| March 20 | Recognition: Concepts and issues. PPT (12MB), PDF (3.5MB) | Resource: ICCV 2005/CVPR 2007 Short Course on Object Recognition |
| March 25 | Bags of features: PPT (7MB), PDF (2MB) | |
| March 27 | Discriminative models: PPT (7MB), PDF (1.5MB) | Reading: F&P sec. 22.1, 22.2, 22.5 |
| April 1 | Generative models (see March 27 slides) | |
| April 3 | Adding spatial information: PPT (8MB), PDF (2MB) | Project progress report due |
| April 8 | Eigenfaces, face detection: PPT (4MB), PDF (1.5MB) | Reading: F&P sec. 22.3, Robust Real-Time Face Detection. Homework: Assignment 3 out |
| April 10 | Face detection concluded | |
| April 15 | Segmentation: PPT (9MB), PDF (3MB) | Reading: F&P ch. 14 |
| April 17 | Optical flow: PPT (5MB), PDF (1MB) | Assignment 3 and second progress report due |
| April 22 | Tracking: PPT (18MB), PDF (1.5MB) | Reading: F&P ch. 17 |
| April 24 | Tracking concluded | |
| April 29 | | Final project report due by 5PM - FIRM DEADLINE! |

Useful Resources

Tutorials, review materials

General reference

MATLAB reference

The real world

Acknowledgments

The course slides draw on materials generously made publicly available by D. Forsyth, J. Ponce, J. Koenderink, S. Seitz, R. Szeliski, B. Freeman, M. Pollefeys, D. Lowe, K. Grauman, A. Efros, F. Durand, L. Fei-Fei, A. Torralba, R. Fergus (and possibly others whose attributions I either couldn't find or omitted by my own negligence).