COMP 590/776: Computer Vision in 3D World

Spring 2023

Instructor: Soumyadip (Roni) Sengupta
Tuesday and Thursday, 2:00-3:15pm, Brooks (FB) - Room 09


Image courtesy of Computer Vision: Algorithms and Applications, 2nd ed., by Richard Szeliski

Overview


Course Description

This is an advanced undergraduate and graduate-level course on the fundamentals of Computer Vision. Recent years have seen an explosion in Computer Vision research and real-world applications, with a large impact on society. This revolution has been powered mainly by advances in Deep Learning techniques and computing resources. In this course we will learn the fundamentals of Computer Vision and try to answer the following questions: (a) How do we perceive color, and how are images formed? (b) How do we process images and extract features from them? (c) How do we connect the 3D world to 2D images and reconstruct 3D from images? (d) How do we develop machine perception that recognizes, detects, and segments objects? We will also study how some Computer Vision algorithms can introduce bias and cause harm to parts of the population, often underrepresented communities, and how we as researchers and practitioners can do better.

Prerequisite

MATH 233 (or any multivariate calculus course) + MATH 347 (or any linear algebra course) + COMP 211 (or COMP 311) + COMP 301 (or COMP 411) + some basic knowledge of probability + coding in Python. One lecture will review linear algebra, calculus, and probability, followed by a short assignment. How well you handle this assignment will indicate your mathematical preparedness for the rest of the course. Knowledge of deep learning is helpful but not required.

Resources

Computer Vision: Algorithms and Applications, 2nd ed., by Richard Szeliski
Multiple View Geometry in Computer Vision, by Richard Hartley and Andrew Zisserman
Computer Vision: A Modern Approach, by David Forsyth and Jean Ponce

Personnel

Instructor Office Hours: Thrs 3:30-4:30pm (SN 255).
TA: Misha Shvets (email: mshvets@cs.unc.edu). TA Office Hours: Tues 3:30-5:00pm (FB 230).
Assignments will be handled via Canvas.

Goal/Student Learning Outcomes

(a) Understand the fundamentals of Computer Vision. (b) Develop implementation expertise by writing code in Python to solve various Computer Vision tasks. (c) Develop the strong mathematical framework required for problem solving. (d) Learn to read and write scientific articles and publications.

Course Policies

Grading (for both 590 & 790):

  • Assignment 1: [pen & paper] Linear Algebra and Probability Recap: 5%
  • Assignment 2: [pen & paper + coding in Python] Frequency domain image analysis: 10%
  • Assignment 3: [coding in Python] Stitching images to generate a panorama: 10%
  • Assignment 4: [pen & paper] Fundamentals of 3D Vision: 10%
  • Assignment 5: [coding in Google Colab] Training neural networks for image classification, detection, and segmentation: 10%
  • Mid-Term: [pen & paper] in-class mid-term: 25%
  • Final Exam: Write a survey paper OR work on a project/demo (both in groups of 3): 25%
  • Class Participation: 5%

Grades for 590 students will be curved differently from 790 students.
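
As a rough illustration of how the weights above combine, here is a minimal Python sketch that computes an uncurved course percentage. The component names, the `weighted_total` helper, and the assumption that every component is scored on a 0-100 scale are hypothetical and not part of the official policy; curving and late penalties are ignored.

```python
# Weights (in percent) copied from the grading list above.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "assignment_1": 5,
    "assignment_2": 10,
    "assignment_3": 10,
    "assignment_4": 10,
    "assignment_5": 10,
    "midterm": 25,
    "final": 25,
    "participation": 5,
}


def weighted_total(scores):
    """Return the raw (uncurved) course percentage.

    `scores` maps component names to scores on an assumed 0-100 scale.
    Components missing from `scores` count as 0.
    """
    # Sanity check: the listed weights should add up to 100%.
    assert sum(WEIGHTS.values()) == 100
    total = sum(WEIGHTS[name] * scores.get(name, 0) for name in WEIGHTS)
    return total / 100


if __name__ == "__main__":
    # Hypothetical student: full marks everywhere except 80 on the mid-term.
    example = {name: 100 for name in WEIGHTS}
    example["midterm"] = 80
    print(weighted_total(example))
```

Under these assumptions, losing 20 points on the mid-term costs 0.25 × 20 = 5 percentage points of the raw total.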

Final Exam Details

  • Students should work in groups of 3. 590 & 790 students can be in the same group.
  • Students may choose either to write a thorough survey paper or to work on a project and present a demo.
  • Students should register their group and their choice of project or survey paper by Jan 31.
  • If you are unable to form a group of 3, you will be randomly assigned to a group on Feb 1, along with a particular survey topic.
  • Survey Paper:
    • Some topics will be provided. Students can choose from this list of topics.
    • Students can also propose a new survey topic and justify it.
    • The goal of writing a survey paper is to: (a) provide a thorough summary of existing research in a field, and (b) build your own depth of knowledge in a topic.
  • Project:
    • Some topics will be provided. Students can choose from this list of topics.
    • Students can also propose a new project topic and justify it.
    • A project topic DOES NOT need to involve research. The project should produce good demo results and should involve multiple existing Computer Vision algorithms.

Misc.

Class Participation: Class participation points will be awarded based on attendance (taken at random in some classes) and on participation in class discussions.

Late Submissions: Late assignments will not be accepted. You lose 1 point for each late day.

Academic Integrity & Collaboration:
  • For your assignments and projects, you are allowed to use materials from external sources as long as they help you understand the topic and do NOT actually solve the assignment problem.
  • You MUST clearly acknowledge any sources used for solving the assignment.
  • You can discuss and brainstorm in groups, but the programming and the solution must be done individually.
  • No copying or replicating of existing solutions.
  • The mid-term is in class. You can bring one 1-sided US Letter size cheat sheet. Collaboration, electronic devices, browsing the web, printed materials, etc. are not permitted.

Schedule

Date | Topic | Details | Special Dates
Intro & Review
Tue Jan 10 | Welcome + Intro. to Computer Vision | Lecture Slide [pptx]
Thrs Jan 12 | Maths Review (Linear Algebra, Probability, Calculus) | Lecture Slide [pdf] | HW1: Maths review (assigned)
Colors & Imaging
Tue Jan 17 | Color & Color Spaces | Lecture Slide [pdf] | HW1: Maths review (due)
Thrs Jan 19 | In-Camera Imaging Pipeline | Lecture Slide [pdf]
Image Processing
Tue Jan 24 | Filtering - Convolution, Gradients, & Edges | Lecture Slide [pdf]
Thrs Jan 26 | Frequency domain - Fourier Analysis | Lecture Slide [pdf] | HW2: Image Processing (assigned)
Features
Tue Jan 31 | Feature Detection (Corner & Blob) | Lecture Slide [pdf]
Thrs Feb 2 | Feature Descriptor & Matching (SIFT) | Lecture Slide [pdf]
2D Transformation
Tue Feb 7 | 2D Transformations & Fitting | Lecture Slide [pdf] | HW2: Image Processing (due)
Thrs Feb 9 | RANSAC + Image Blending | Lecture Slide [pdf] | HW3: Panorama (assigned)
Tue Feb 14 | No Class
3D Vision
Thrs Feb 16 | Camera Models + Calibration - 1 | Lecture Slide [pdf]
Tue Feb 21 | Camera Models + Calibration - 2 | Lecture Slide [pdf]
Thrs Feb 23 | Two-view Geometry | Lecture Slide [pdf] | HW3: Panorama (due)
Tue Feb 28 | Stereo | Lecture Slide [pdf] | HW4: 3D Vision (assigned)
Thrs Mar 2 | Multi-view Stereo | Lecture Slide [pdf]
Tue Mar 7 | Structure from Motion | Lecture Slide [pdf]
Thrs Mar 9 | Light & Photometric Stereo | Lecture Slide [pdf] | HW4: 3D Vision (due) - free extension till Mar 14 due to ICCV+MICCAI deadlines
Tue/Thrs Mar 14/16 | No Class (Spring break)
Tue Mar 21 | Mid-term Review
Thrs Mar 23 | MIDTERM (in-class) | Syllabus: until Mar 9.
Learning & Perception
Tue Mar 28 | Recognition - 1 (intro) | Lecture Slide [pdf]
Thrs Mar 30 | Recognition - 2 (deep learning) | Lecture Slide [pdf]
Tue Apr 4 | Object Detection | Lecture Slide [pdf] | HW5: Deep Learning (assigned)
Thrs Apr 6 | No Class
Tue Apr 11 | Segmentation | Lecture Slide [pdf]
Thrs Apr 13 | NeRFs | Lecture Slide [pdf]
Tue Apr 18 | Video & Motion
Guest Lectures
Thrs Apr 20 | Guest Lecture by Dr. Sara Beery | Title: Recognition for Biodiversity (via Zoom) | HW5: Deep Learning (due)
Tue Apr 25 | Guest Lecture by Dr. Jia-bin Huang | Title: How to write a good paper? (via Zoom)
Thrs Apr 27 | Deep learning for 3D | Lecture by Mike Wang (PhD student at UNC)
Fri May 5 | FINAL PROJECT/SURVEY | No class, online submission only