CS 6476: Computer Vision, Fall 2019
College of Computing, Georgia Tech

Announcements
- [New] Please make a note of the change in venue for the lectures. Starting 8/27 (Tue), the class is going to be at MoSE (Molecular Sciences and Engineering) G011.
- Do not forget to register for the class on Piazza using this sign-up link. 1% of the class participation grade is reserved for signing up on Piazza!
- Welcome to the course!
"To see is to know what is where by looking" -- David Marr
Over the past few decades, machines have come a long way in their ability to "see". Some examples are autonomous navigators such as self-driving cars, medical imaging technologies, image search engines, face detection and recognition systems in apps, aids for the visually impaired, control-free video games, and industrial automation systems.
In this introductory Computer Vision course, we will learn how to "teach machines to see". We will explore several fundamental concepts including image formation, feature detection, segmentation, multiple view geometry, recognition, and video processing. We will use these concepts to build applications that help machines see the world around them.
Prerequisites
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:
- Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
- Programming: Assignments are to be completed, and will be graded, in either Python or MATLAB.
- Math: Linear algebra, vector calculus, and probability. Linear algebra is the most important and students who have not taken a linear algebra course have struggled in the past.
Teaching Staff

- parikh@gatech.edu (most Tues, Thu 1:15-1:45pm)
- samyak@gatech.edu (CCB, level 1 common area)
- akumar624@gatech.edu (CCB, level 1 common area)
- bhoomichheda@gatech.edu (CCB, level 1 common area)
- patrick.grady@gatech.edu (CCB, level 1 common area)
- arjun.majumdar@gatech.edu (CCB, level 1 common area)
- kunalchawla@gatech.edu (CCB, level 1 common area)
Deliverables
Problem Sets (65% of final grade): You will be given 6 problem sets, one approximately every two weeks. These will involve a combination of conceptual questions and programming problems. The programming problems will provide hands-on experience working with techniques covered in or related to the lectures. All code and written responses must be completed individually and submitted to Canvas. Most problem sets will take significant time to complete. Please start early. Problem Set 0 (PS0) will be worth 5% of the final grade, and the remaining 5 problem sets will be worth 12% each.
Project (30% of final grade): Your project can apply any of the techniques we studied in class to real-world problems. You can also extend a technique or analyze it empirically. Comparisons between two approaches are also welcome. It is wonderful if you design and evaluate a novel approach to an important existing or new vision problem. Be creative! Students are allowed to use existing code for their projects. However, make sure that you add proper references/citations to any existing code that you use (e.g., links to GitHub repos). Students are also expected to do a substantial amount of work on top of any existing implementations and to clearly delineate, as part of their project updates, what was already available versus what they did themselves. You must work in teams of 3-5. Students should maintain a professional-looking, visual, self-contained webpage describing their project. We will link to all project pages from the class webpage. The following are the deliverables for your project. All deliverables (including the proposal) are to be submitted via the project webpage. The webpage source files should be added to a ZIP folder and uploaded to Canvas.
- Proposal (20% of project grade): A description of the following (to be submitted via the project web-page):
- Problem statement: Clearly state the goal of your project. When someone uses your system, what is the expected input to the system, and what is the desired output?
- Approach: Describe the technical approach you plan to employ.
- Experiments and results: Describe the experimental setup you will follow, which datasets you will use, which existing code you will exploit, what you will implement yourself, and what you would define as a success for the project. If you plan on collecting your own data, describe what data collection protocol you will follow. Provide a list of experiments you will perform. Describe what you expect the experiments to reveal, or what is uncertain about the potential outcomes.
- Two Project Updates (50% of project grade, 25% each): There will be two updates: a mid-term and a final update (both to be submitted via the project web-page). Here is an outline of what the project web-page is supposed to cover.
- Abstract: One or two sentences on the motivation behind the problem you are solving. One or two sentences describing the approach you took. One or two sentences on the main result you obtained.
- Teaser figure: A figure that conveys the main idea behind the project or the main application being addressed.
- Introduction: Motivation behind the problem you are solving, what applications it has, any brief background on the particular domain you are working in (if not regular RGB photographs), etc. If you are using a new way to solve an existing problem, briefly mention and describe the existing approaches and tell us how your approach is new.
- Approach: Describe very clearly and systematically your approach to solve the problem. Tell us exactly what existing implementations you used to build your system. Tell us what obstacles you faced and how you addressed them. Justify any design choices or judgment calls you made in your approach.
- Experiments and results: Provide details about the experimental setup (number of images/videos, number of datasets you experimented with, train/test split if you used machine learning algorithms, etc.). Describe the evaluation metrics you used to evaluate how well your approach is working. Include clear figures and tables, as well as illustrative qualitative examples if appropriate. Be sure to include obvious baselines to see if your approach is doing better than a naive approach (e.g., for classification accuracy, how well would a classifier that made random decisions do?). Also discuss any parameters of your algorithms, and tell us how you set the values of those parameters. You can also show us how the performance varies as you change those parameter values. Be sure to discuss any trends you see in your results, and explain why these trends make sense. Are the results as expected? Why?
- Qualitative results: Show several visual examples of inputs/outputs of your system (success cases and failures) that help us better understand your approach.
- Conclusion and future work: Conclusion would likely make the same points as the abstract. Discuss any future ideas you have to make your approach better.
- References: List out all the references you have used for your work.
Here are some examples for your reference.
- See this for a webpage template.
- See this for an example of a nice, professional looking page.
- See this for an example of how to lay out the various details of your project. You may need to provide more details than this, because you will not be submitting an associated paper to accompany the webpage. So the page should be self-contained.
- Project Video (30% of project grade): Teams will prepare a 1 min. YouTube video summarizing the project. The video is a teaser to convey the main points, and gain the viewer's interest in wanting to know more. It should be understandable by anyone familiar with Computer Vision. The YouTube link to the video will be submitted as an assignment via Canvas. Here are some example videos for your reference.
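The "obvious baselines" item in the experiments outline above asks how well a classifier making random decisions would do. The following is a minimal sketch of how such a baseline could be estimated in Python. The label list is hypothetical, and the majority-class baseline is an extra comparison that the outline does not mention; treat both as illustrations rather than a required evaluation protocol.

```python
import random
from collections import Counter


def random_baseline_accuracy(labels, num_trials=200, seed=0):
    """Estimate the accuracy of guessing a class uniformly at random."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    correct = 0
    for _ in range(num_trials):
        for true_label in labels:
            if rng.choice(classes) == true_label:
                correct += 1
    return correct / (num_trials * len(labels))


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical, imbalanced test labels (not from any course dataset).
test_labels = ["cat"] * 70 + ["dog"] * 20 + ["bird"] * 10

print("random guess:", random_baseline_accuracy(test_labels))
print("majority class:", majority_baseline_accuracy(test_labels))
```

With three classes, uniform guessing lands near one third regardless of class balance, while the majority-class baseline depends on how skewed the labels are. Reporting both next to your own accuracy makes it easier to judge whether an approach is learning anything.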
Participation and attendance (5% of final grade): Participation in class and regular attendance are expected. If for whatever reason you are absent, it is your responsibility to find out what you missed that day. Also, 1% of the participation grade is reserved for enrolling in the class on Piazza.
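As a quick consistency check on the weights stated above, here is a minimal Python sketch that combines component scores into a final percentage. It assumes the final grade is a plain weighted sum of the stated percentages, which the course page does not spell out; the function name, dictionary keys, and example scores are hypothetical.

```python
# Weights as stated in the Deliverables section (percent of final grade).
PROBLEM_SET_WEIGHTS = [5] + [12] * 5  # PS0, then the remaining 5 problem sets
PROJECT_WEIGHT = 30
PARTICIPATION_WEIGHT = 5

# Weights within the project (percent of project grade).
PROJECT_PARTS = {
    "proposal": 20,
    "midterm_update": 25,
    "final_update": 25,
    "video": 30,
}


def final_grade(ps_scores, project_scores, participation):
    """Return a final percentage; all input scores are fractions in [0, 1].

    Assumes a simple weighted sum, which is an assumption of this sketch.
    """
    ps_total = sum(w * s for w, s in zip(PROBLEM_SET_WEIGHTS, ps_scores))
    project_fraction = sum(
        PROJECT_PARTS[name] * project_scores[name] for name in PROJECT_PARTS
    ) / 100
    return (
        ps_total
        + PROJECT_WEIGHT * project_fraction
        + PARTICIPATION_WEIGHT * participation
    )


# Hypothetical scores for one student.
example = final_grade(
    ps_scores=[1.0, 0.9, 0.8, 0.9, 1.0, 0.85],
    project_scores={
        "proposal": 0.9,
        "midterm_update": 0.8,
        "final_update": 0.95,
        "video": 1.0,
    },
    participation=1.0,
)
print(round(example, 2))
```

The problem set weights sum to 65, and together with the project (30) and participation (5) they account for the full 100%.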
Due Dates: All problem sets/project deliverables are to be submitted by the due date noted on the schedule below. Deadlines are firm. If Canvas marks your submission as late, it will be treated as such. Please plan your submissions/uploads keeping this in mind.
Late Day policy: Every student gets two types of late days: 4 individual late days for problem sets and 2 project late days for project-related deliverables. The 2 project late days are for the whole team and not per team member.
The individual late day allowance can be used to accrue up to 4 days in late problem set submissions without any penalty. For example, you could submit one problem set 4 days late or 2 problem sets each 2 days late, and so on. Once you have used all your individual late days, a late problem set submission will be awarded 0 credit. Please plan ahead so you can spend your late days wisely. In particular, note that we expect you will find the earlier problem sets easier than those later in the course.
The 2 project late days can only be used by a project team as a whole; that is, they can be used by the project team for project-related deliverables. They cannot be used by individual students for problem sets. Similarly, students cannot use their individual late days for project-related deliverables.
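The late-day bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: lateness is counted in whole days, and a submission that needs more late days than remain receives 0 credit without consuming any of the remaining balance. The policy text does not address either point, and the function name is hypothetical.

```python
INDIVIDUAL_LATE_DAYS = 4  # per student, problem sets only
PROJECT_LATE_DAYS = 2     # per team, project deliverables only


def apply_late_days(budget, days_late_per_submission):
    """Walk through submissions in order and spend late days from one budget.

    Returns (accepted, remaining), where accepted[i] is False when
    submission i would receive 0 credit.
    Assumption: whole days only, and a rejected submission spends nothing.
    """
    remaining = budget
    accepted = []
    for days_late in days_late_per_submission:
        if days_late <= remaining:
            remaining -= days_late
            accepted.append(True)
        else:
            accepted.append(False)
    return accepted, remaining


# The two examples from the policy: one problem set 4 days late,
# or two problem sets each 2 days late.
print(apply_late_days(INDIVIDUAL_LATE_DAYS, [4]))
print(apply_late_days(INDIVIDUAL_LATE_DAYS, [2, 2]))

# A third late submission after the budget is exhausted gets no credit.
print(apply_late_days(INDIVIDUAL_LATE_DAYS, [2, 2, 1]))
```

The two budgets are kept separate on purpose: a student's problem sets would be checked against `INDIVIDUAL_LATE_DAYS`, and a team's project deliverables against `PROJECT_LATE_DAYS`, with no transfer between them.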
Audit policy: If you wish to audit the course, you must obtain an overall score of at least 50% on the assignments.
Textbook: Computer Vision: Algorithms and Applications, by Rick Szeliski. An electronic copy is available free online here. Some background reading on object recognition is from Kristen Grauman and Bastian Leibe's short book on Visual Object Recognition.
Schedule (tentative)
Note that all deliverables are due at 11:58:59 pm on their respective due dates as mentioned on the schedule. If Canvas marks your submission as late, it will be treated as such.
Date | Topic | Readings and Links | Lectures | Deliverables |
---|---|---|---|---|
Tue, 8/20 | Course Intro | Sec 1.1-1.3 | Intro [ ppt ] | PS0 out |
Thu, 8/22 | Features and Filters | Sec 3.1.1-2, 3.2 | Linear Filters [ ppt ] | |
Mon, 8/26 | | | | |
Tue, 8/27 | | Sec 3.2.3, 4.2 | Gradients [ ppt ] | |
Wed, 8/28 | | | | PS0 due (extended!) |
Thu, 8/29 | | Sec 3.3.2-4 | Edges and binary image analysis [ ppt ] | PS1 out |
Tue, 9/3 | | Sec 10.5 | Texture [ ppt ] | |
Thu, 9/5 | | Sec 2.3.2 | Color [ ppt ] | |
Tue, 9/10 | Grouping and Fitting | Sec 5.2-5.4 | Segmentation and Clustering [ ppt ] | |
Wed, 9/11 | | | | PS1 due |
Thu, 9/12 | | Sec 4.3.2 | Hough Transform [ ppt ] | PS2 out |
Tue, 9/17 | | Sec 5.1.1 | Deformable Contours [ ppt ] | |
Thu, 9/19 | | Sec 2.1.1, 2.1.2, 6.1.1 | Alignment and 2D image transformations [ ppt ] | |
Tue, 9/24 | Multiple views and motion | Sec 3.6.1, 6.1.4 | Homography and image warping [ ppt ]; Notes on homography matrix | |
Wed, 9/25 | | | | PS2 due |
Thu, 9/26 | | Sec 4.1 | Local invariant features (Part 1) [ ppt ] | |
Tue, 10/1 | | Sec 4.1 | Local invariant features (Part 2) [ ppt ] | PS3 out |
Thu, 10/3 | | Sec 11.1.1, 11.2-11.5 | Image Formation [ ppt ] | |
Tue, 10/8 | | Sec 11.1.1, 11.2-11.5 | Epipolar Geometry and Stereo [ ppt ] | |
Wed, 10/9 | | | | Project proposal due |
Thu, 10/10 | | | Structure from Motion [ ppt ] | |
Tue, 10/15 | | | No class (Fall Break) | |
Wed, 10/16 | | | | PS3 due |
Thu, 10/17 | Recognition | Sec 14.3 | Indexing local features and instance recognition [ ppt ] | |
Tue, 10/22 | | Sec 14.1 | Intro to category recognition [ ppt ] | |
Thu, 10/24 | | Sec 14.1 | Face Detection [ ppt ] | PS4 out |
Tue, 10/29 | | | No class: Work on your projects | |
Thu, 10/31 | | | No class: Work on your projects | Mid-term project update due |
Tue, 11/5 | | Sec 14.3 | Discriminative classifiers for image recognition [ ppt ] | |
Wed, 11/6 | | | | PS4 due |
Thu, 11/7 | | Sec 14.3 | Part-based models [ ppt ] | |
Tue, 11/12 | | | Face Recognition [ ppt ] | PS5 out |
Thu, 11/14 | | | No class | |
Tue, 11/19 | Video Processing | Sec 8.4, 12.6.4 | Motion and optical flow [ ppt ] | |
Thu, 11/21 | | Sec 8.4, 12.6.4 | Background subtraction, action recognition [ ppt ] | |
Tue, 11/26 | | Sec 5.1.2, 4.1.4 | Tracking [ ppt ] | |
Wed, 11/27 | | | | PS5 due |
Thu, 11/28 | | | No class (Thanksgiving Break) | |
Tue, 12/3 | | | On What's Possible Today [ ppt ] (Guest Lecture: Harsh Agrawal) | Final project update due; Project video due |
References
- Course webpage for a previous offering of the course.
- This course closely follows CS 376: Computer Vision at UT Austin, taught by Kristen Grauman.
- A few other similar courses (by no means an exhaustive list):
Acknowledgments
Thanks to visualdialog.org for the webpage format.