How can we use computers to recognize objects, people, actions, animals, places, etc. from images? This seemingly trivial task, which people perform without much effort, has remained one of the core problems in Computer Vision. In this class we will study, play with, and implement algorithms for computational visual recognition using machine learning and deep learning. The class sessions will consist of lectures by the instructor on the most foundational topics, and several student-led paper review sessions to study more recent developments. After this class you will be able to use computational visual recognition for problems ranging from classifying images to detecting and outlining every object in an image. In summary, after successful completion of this course you should be able to teach a robot how to distinguish dogs from cats.
Prerequisites: This course requires no previous background in computer vision or machine learning, but knowledge of either will be helpful. You need to know about matrices, calculating derivatives, and probabilities (Bayes' rule). You will also need to be at least a moderately proficient programmer in Python. There will be several lab assignments. These assignments will show you the basics of modern general visual recognition algorithms and models, and will give you the tools for implementing more advanced ones. There will also be a couple of quizzes directly related to the assignments and material covered during class. We will also have a class project where you can work on something beyond your assignments, with more freedom to pursue a focused problem that interests you and better matches your background. Finally, we will be using Python/PyTorch in the lecture notes, so it helps to have completed a few projects in Python before the class starts. You should install Python, Jupyter, and PyTorch, and complete the following notebook before our first day of class: [pytorch_tensors].
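As a rough self-check of the math prerequisites listed above (matrices, derivatives, and Bayes' rule), here is a minimal sketch in plain Python. It is not part of the course materials or the [pytorch_tensors] notebook; the function names `matmul`, `numerical_derivative`, and `bayes_posterior` and the dog-classifier numbers are made up for illustration. It deliberately avoids PyTorch so it runs before anything is installed.

```python
# Prerequisite self-check in plain Python (standard library only).
# All names and numbers below are illustrative assumptions, not course code.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def numerical_derivative(f, x, h=1e-5):
    """Central-difference estimate of the derivative f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) from P(A), P(B | A), and P(B | not A) via Bayes' rule."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Matrices: multiplying by a permutation matrix swaps the columns.
product = matmul([[1, 2], [3, 4]], [[0, 1], [1, 0]])

# Derivatives: d/dx of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope = numerical_derivative(lambda x: x ** 2, 3.0)

# Bayes' rule: a hypothetical detector fires on 90% of dog images and on 20%
# of non-dog images, and half of all images contain dogs. Given that it fired,
# how likely is it that the image contains a dog?
posterior = bayes_posterior(prior=0.5, likelihood=0.9, false_positive_rate=0.2)

print(product, round(slope, 4), round(posterior, 4))
```

If you can predict each printed value by hand before running the script, you likely have the math background the labs assume.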
Grading: Labs: 30pts (Lab-1: 5pts, Lab-2: 5pts, Lab-3: 10pts, Lab-4: 10pts), Paper presentation and summaries: 10pts, Quiz: 20pts, Project: 40pts.
Date | Topic
Tues, August 22nd | Lecture: Introduction [slides]
Thurs, August 24th | Lecture: Image Processing Basics & Image Features [slides] | Reading: Szeliski Book, Chapter 3.
Tues, August 29th | Lecture: Machine Learning for Vision I [slides]
Thurs, August 31st | Lecture: Machine Learning for Vision II [slides]
Tues, September 5th | Lecture: Deep Learning for Vision I [no slides, only chalkboard]
Thurs, September 7th | TA Lecture: Categorization and the Perceptron Model
Tues, September 12th | No class this day -- Please use this time to work on your Lab.
Thurs, September 14th | Lecture: Deep Learning for Vision II [some slides, mostly chalkboard] | Supplementary Reading: Neural Networks by Steve Renals
Tues, September 19th | Lecture: Deep Learning for Vision III: Intro to Convnets [slides]
Thurs, September 21st | Lecture: Deep Learning for Vision IV: Classification [slides] | Extra readings: [Alexnet paper], [VGG-16 slides], [VGG-16 paper]
Tues, September 26th | Lecture: Deep Learning for Vision V: Detection [slides] | Extra readings: [GoogLeNet], [ResNet]
Thurs, September 28th | Lecture: Deep Learning for Vision VI: Segmentation [slides] | Extra readings: [R-CNN], [Fast-RCNN], [Faster-RCNN], [FCNs]
Tues, October 3rd | No class this day -- Reading Days / Fall Break.
Thurs, October 5th | Lecture: Deep Learning for Vision VII [see previously posted slides / chalkboard / in-class Python demonstration]
Submit a 1 or 2 page project proposal in PDF on UVA Collab (Deadline: Thursday October 5th at 5pm).
Tues, October 10th | Lecture: Deep Learning for Vision VIII [slides] | Extra readings: [Image-Captioning], [Question-Answering]
Thurs, October 12th | Lecture: Deep Learning for Vision IX | Extra readings: [Generative Adversarial Networks]
Tues, October 17th | Lecture: Generative Adversarial Networks [slides]
Thurs, October 19th | Student Paper Review: Style-transfer Models
Tues, October 24th | Student Paper Review: Unsupervised learning of Deep Neural Networks.
Thurs, October 26th | Student Paper Review: Recent Advances in Generative Adversarial Networks
Tues, October 31st | Student Paper Review: People Recognition
Submit a 2 or 3 page project progress report in PDF on UVA Collab (Deadline: Tuesday October 31st at 5pm). Use this template.
Thurs, November 2nd | In-Class Activity: Quiz Preparation.
Tues, November 7th | In-Class Activity: Quiz Preparation II Quiz Preparation Review
Thurs, November 9th | Student Paper Review: Motion, Tracking, and Video
Tues, November 14th | No class this day -- Please use this time to work on your projects, and attend the talk on Friday by Prof. Yuille instead.
Thurs, November 16th | Lecture: Course Recap
Friday, November 17th, attend the talk by Prof. Alan Yuille at 11:00AM in Monroe Hall 130.
Speaker: Prof. Alan Yuille (Johns Hopkins University / MIT's Center for Brains, Minds and Machines)
Talk: Representing Objects by Binary Visual Concepts Encoding
More information: Prof. Yuille, who obtained his PhD in theoretical physics under Prof. Stephen Hawking at Cambridge, will talk about his research on models that rely on binary representations and could be better than deep neural networks. See more details in the attached poster here. Invite more people!
Tues, November 21st | Quiz (20 pts)
Thurs, November 23rd | Thanksgiving recess - no classes this day.
Tues, November 28th | Project Presentations
Thurs, November 30th | Project Presentations
Tues, December 5th | Project Presentations
Submit a 4 to 5 page final project report in PDF on UVA Collab + Link to your code (Deadline: Thursday December 5th). Use this template.
"The School of Engineering and Applied Science relies upon and cherishes its community of trust. We firmly endorse, uphold, and embrace the University’s Honor principle that students will not lie, cheat, or steal, nor shall they tolerate those who do. We recognize that even one honor infraction can destroy an exemplary reputation that has taken years to build. Acting in a manner consistent with the principles of honor will benefit every member of the community both while enrolled in the Engineering School and in the future. Students are expected to be familiar with the university honor code, including the section on academic fraud."
Instructor's Note: In this class particularly, lab assignments are individual. You can still discuss them in a group or with your friends, but you should not be straight up copying somebody else's solution or code. Not even a single line of code. You might be tempted to think, well, in how many ways could I write c = c + c * c - 2? You are probably right, but what if that's actually a spectacularly wrong solution, and only two students turn in a solution with this unlikely expression in it? If there are two assignments where I notice something even slightly as suspicious as this, I, the instructor, Vicente, will refer the case to the Honor Code system, where the outcome, if the academic misconduct is proven, will probably be a harsh dismissal from the university. Also, do not try to get solutions from previous versions of this class; I keep those solutions on file and I am good at remembering code I have seen before. The UVA Honor Code system is harsh indeed: there are not as many possible outcomes as in other systems. I strongly advise you not to do anything bad. It is not worth it. Most of the grade in this course will be the course project in any case. Not turning in a lab assignment is much preferable to turning in something that contains academic misconduct. Beyond the possible academic consequences, it will be incredibly disappointing to me if I find any traces of this in lab assignments. Be clear about what your original contributions are in the class project, and enjoy doing the work on your lab assignments. So let's just all enjoy the class, and avoid this.