Visual recognition and language understanding are two challenging tasks in artificial intelligence. In this course we will study and acquire the skills to build machine learning and deep learning models that can reason about images and text. In particular, we will study a recent body of research at the intersection of vision and language, including: generating image descriptions using natural language, visual question answering, image retrieval using complex text queries, learning from weakly supervised text, aligning images and text in large data collections, generating images from textual descriptions, video and language, and other related topics. On the technical side, we will study models including bag-of-words, n-gram language models, neural language models, probabilistic graphical models (PGMs), recurrent neural networks (RNNs), long short-term memory networks (LSTMs), convolutional neural networks (ConvNets), and memory networks.
Prerequisites: Any one of the following courses: Machine Learning, Computational Visual Recognition, Natural Language Processing. In short, this should not be your introductory course to Machine Learning. It is instead a research-oriented course targeted at PhD students and MS students with research interests. If you think you have taken a course with a strong Machine Learning component, let me know so I can better advise you.
Additional Requirement: In this class you will need a GPU to execute the code in some of your assignments and to develop a course project, which will be an important part of your grade. Ideally you should have access to a GTX 1080 GPU or a better graphics card; at a minimum you should have access to a CUDA-capable card with at least 4GB of GPU RAM. Alternatively, you can fulfill this requirement by learning how to use Amazon EC2 instances; G2 and P2 instances have GPUs (more info here). As a student, you can get a limited number of free computing credits from the AWS Educate program, but you have to be careful not to run out of your budget when using these cloud instances. Note: The CS Department also has three machines named cuda1-3, which you can SSH into from any CS server, and five machines with NVIDIA K20s (artemis1-5) in a cluster (instructions). These nodes might be enough for some assignments but probably not for your project. You are encouraged to use them, but they do not fulfill the requirement outlined here.
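If you are unsure whether a machine meets the 4GB minimum, one way to check is sketched below. It is only an illustration and not an official course tool: it assumes the NVIDIA driver is installed so that the `nvidia-smi` command is available, and the helper names (`parse_gpu_report`, `meets_requirement`, `query_local_gpus`) are made up for this example. Drivers sometimes report slightly less than the nominal memory size, so treat a near miss as a reason to look manually.

```python
import shutil
import subprocess

# The 4GB minimum from the requirement above, expressed in MiB (an assumption
# about how to interpret "4GB"; nvidia-smi reports memory in MiB).
MIN_GPU_MEMORY_MIB = 4 * 1024


def parse_gpu_report(report):
    """Parse lines of the form 'name, memory_in_MiB' into (name, mib) tuples."""
    gpus = []
    for line in report.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = line.rsplit(",", 1)
        try:
            gpus.append((name.strip(), int(memory.strip())))
        except ValueError:
            continue  # memory field was not a number (e.g. not available)
    return gpus


def meets_requirement(gpus, minimum_mib=MIN_GPU_MEMORY_MIB):
    """Return True if at least one GPU reports the minimum amount of memory."""
    return any(memory >= minimum_mib for _, memory in gpus)


def query_local_gpus():
    """Ask nvidia-smi for GPU names and total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_report(result.stdout)


if __name__ == "__main__":
    local_gpus = query_local_gpus()
    if not local_gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is not installed).")
    for name, memory in local_gpus:
        print(f"{name}: {memory} MiB")
    print("Meets the 4GB minimum:", meets_requirement(local_gpus))
```

Running the script on your laptop, on a department machine, or on a cloud instance prints each detected GPU and whether any of them reaches the threshold.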
Date | Topic
Thursday, January 19th | Lecture: Introduction [slides]
Tuesday, January 24th | Lecture: Introduction to Computer Vision [slides] | Recommended Reading: Szeliski, Chapter 3.1-3.2. Link to online book [here].
Thursday, January 26th | Lecture: Introduction to Natural Language Processing [slides]
Tuesday, January 31st | Lecture: Introduction to Deep Learning [slides]
Thursday, February 2nd | Lecture: Convolutional Neural Networks I [slides]
Tuesday, February 7th | Lecture: Convolutional Neural Networks II [slides]
Thursday, February 9th | Lecture: Recurrent Neural Networks I [slides]
Tuesday, February 14th | Lecture: Recurrent Neural Networks II / Word Embeddings [slides]
Thursday, February 16th | Lecture: Deep-learning based Visual Recognition [slides]
Tuesday, February 21st | Course Project Proposal Presentations
Thursday, February 23rd | Student Paper Review: Webly-supervised learning | Presenters: Ji, Fandi [slides]; Leandra [slides]
Submit a 1- or 2-page project proposal in PDF (deadline: Monday, February 27th).
Tuesday, February 28th | Student Paper Review: Generating Image Descriptions | Presenters: Jieyu [slides]; Brady, Kerry [slides]
Thursday, March 2nd | Student Paper Review: Video and Text | Presenters: Nick, Tom [slides]; Abhimanyu, Gautam [slides]
Before Spring Break, meet with the instructor to confirm your final project proposal. Note: If the instructor approved your project without objections on UVACollab, there is no need to meet; you can start working on your project.
Tuesday, March 7th | Spring Break - no classes this day.
Thursday, March 9th | Spring Break - no classes this day.
Tuesday, March 14th | Student Paper Review: Visual Question Answering | Presenters: Yujia, Luyao; Anudeep
Thursday, March 16th | ICCV Deadline - no classes this day.
Tuesday, March 21st | Student Paper Review: Attention Models and Recurrent Neural Networks | Presenters: Siva, Haina [slides]; Arshdeep, Sihang
Thursday, March 23rd | Student Paper Review: Memory-augmented Networks | Presenters: Paola, Vicente [slides]; Erik, Leigh
Tuesday, March 28th | Lecture: Vicente
Thursday, March 30th | Student Paper Review: Topic of your choice | Presenters: Wasi, Monica [slides]; Masudur
Tuesday, April 4th | Student Paper Review: Text to Image Generation | Presenters: Cara; Seth, Colin [slides]
Thursday, April 6th | Student Paper Review: Referring Expressions | Presenters: Qingyu, Mengyao; Tianyi [slides]
Thursday, April 6th, 11:59pm: Submit a 2-page report on UVACollab detailing progress on your project (describe the dataset, experiments, or preliminary results that you have so far, and state clearly what is left to be completed). I will provide feedback but no grade; use this opportunity to show me your progress. Use this template.
Tuesday, April 11th | Student Paper Review: Topic of your choice | Presenters: Leo [slides]; Yang, Angyang
Thursday, April 13th | Brainstorming Session for New Tasks & Ideas
Tuesday, April 18th | Course re-cap
Thursday, April 20th | In-class Activity
Tuesday, April 25th | Project Presentations
Thursday, April 27th | Project Presentations
Tuesday, May 2nd | Project Presentations
Project Deadline: Submit your report on UVACollab (May 2nd, 11:59pm EST). Use the following format for your report: template. Minimum 4 pages, maximum 6 pages.