COMP 640: Graduate Seminar in Machine Learning
- Instructor: Anshumali Shrivastava (anshumali AT rice.edu)
- Class Timings: Monday 4pm - 5:35pm
- Location: Duncan Hall 1046
- Office Hours: Monday 3:00pm - 4:00pm
Structure
We will pick popular papers in Deep Learning/Machine Learning and discuss them.
This year our theme is The Unification of Deep Learning Systems. We start by understanding recommendation systems and why they are the largest workload. NLP increasingly looks like recommendation systems and keeps growing in that direction, and since last year vision has been moving toward NLP.
The class will start with an informal short introduction (20 minutes) to the key concepts and contributions. Then we will open the floor to two pre-assigned groups. One group will argue why the paper deserves all the credit and what subtleties of deep learning it brings to the table. The other group will serve as devil's advocates: their job is to argue either that, based on what was known up to the paper's publication, it presents nothing new, or that its claims are questionable in light of more recently published work.
An argument is only valid if it is supported by at least one of the following:
- Mathematical Reductions and Equivalences.
- Claims/Arguments/Theorems from published works.
- Experimental data and plots that appeared in published works.
The class aims to introduce graduate students to a rigorous process of forming valid scientific arguments to judge an idea. The hope is that this debate process will serve four purposes:
1. Create a practice of questioning every claim, even those made in some of the most famous papers.
2. Develop a deeper understanding of the paper.
3. Help students understand what the community considers a contribution, which will shape their thought process before writing papers.
4. Make them better future reviewers.
The class will be divided randomly into FOR and AGAINST groups. Papers will be posted one week in advance. Each group will jointly submit a 2-page report of its findings and arguments.
Grading and Logistics
For 1 credit: class participation and the 1-page reports, plus signing up for one informal short introduction and one discussion summarization. In addition, students can undertake a semester-long research project for 3 credits; discuss the 3-credit requirements with the instructor.
Prerequisite
A rigorous course in machine learning is required. We will be discussing advanced ML papers every week.
Presentations and Summarization Logistics
Participation in the debates, along with the 2-page report, is required.
In addition, each student should sign up for 1 class to present (2 students per class) and 1 class to summarize the discussions (2 students per class). You cannot summarize the same class that you presented. The summarization should be submitted no later than one week after the presentation.
Please sign up for the scribe and presentation assignments at the Google Spreadsheet
Schedule
- 08/30 : Introduction, Logistics, Familiarity with the Class and Warmup Debates. slide
- 09/06 : LABOR DAY
- 09/13 : Recommendation Systems at Microsoft (DSSM), Amazon (Semantic Product Search), and Facebook (DLRM)
- Microsoft: Learning Deep Structured Semantic Models for Web Search using Clickthrough Data paper
- Amazon: Semantic Product Search paper
- Facebook: Deep Learning Recommendation Model (DLRM) for Personalization and Recommendation Systems paper
- 09/20 : Question Answering (It is Information Retrieval!)
- Question Answering with Subgraph Embeddings paper
- The concept of learning embeddings is quite old. Here is a good paper from 2012: Learning to Embed Songs and Tags for Playlist Prediction paper
- The famous BERT Paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding paper
- 09/27 : AlphaFold (Structured Prediction. Embeddings again!)
- Highly accurate protein structure prediction with AlphaFold paper
- DeepMind Blog link
- Old Paper: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation paper
- 10/04 : Deep Reinforcement Learning
- A Nice Tutorial link
- Actor Critic Model in RL pdf
- Is RL a fancy supervised regression that predicts discounted rewards? blog (see the discounted-return sketch after the schedule)
- 10/11 : MIDTERM RECESS
- 10/18 : Efficient Training and Sparsity
- Scalable and Sustainable Deep Learning pdf (early 2016)
- High Level Talk on Scalable and Sustainable Deep Learning link
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer pdf (2017)
- 10/25 : Memorization and Generalization
- Wide & Deep Learning for Recommender Systemspdf
- Understanding deep learning requires rethinking generalization pdf
- The Expressive Power of Neural Networks: A View from the Width pdf
- 11/01 : Overparametrization and Implicit Regularization
- The Role of Explicit Regularization in Overparameterized Neural Networks video
- A good presentation on this topic pdf
- Implicit Regularization in Over-parameterized Neural Networks pdf
- Larger Models Train Faster: Empirically Characterizing Overparameterization Impact on Convergence pdf
- 11/08 : Compressing Neural Networks (TinyAI or TinyML)
- A Blog Summarizing several Ideas link
- Compressing Neural Networks with the Hashing Trick pdf
- Chip Design using RL blog
- CoCoPIE: Making Mobile AI Sweet As PIE Compression-Compilation Co-Design Goes a Long Way pdf
- 11/15 : AutoML
- 11/22 : Multi-Armed Bandits (Simple RL; see the epsilon-greedy sketch after the schedule)
- A nice Blog (The Multi-Armed Bandit Problem and Its Solutions) link
- Contextual Bandits link
- Contextual Bandits as a Simplified Special Case of RL blog1
- The Weak Assumptions of Multi-Armed Bandits vs. the Strong Markovian Assumption of RL blog2
- 11/29 :
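For the 10/04 session, here is a minimal sketch of the "RL as supervised regression" framing the blog post asks about: compute discounted returns from a reward trajectory and treat them as regression targets for a value predictor. This is an illustrative warm-up, not code from any assigned paper; the function name and example rewards are made up.

```python
# Illustrative sketch (not from any assigned paper): discounted returns
# G_t = r_t + gamma * G_{t+1}, the regression targets behind the
# "RL as supervised regression on discounted rewards" framing.

def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# A 4-step episode with a single terminal reward:
print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.9))
# ~[0.729, 0.81, 0.9, 1.0] (up to floating-point noise)
```

A value network trained by regression on these targets is exactly the "fancy supervised regression" in question; the debate is whether the bootstrapping and exploration in full RL make it fundamentally more than that.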
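Similarly, for the 11/22 session, a minimal epsilon-greedy sketch of the multi-armed bandit problem may help anchor the discussion of how weak the bandit assumptions are compared to full RL. The arm probabilities and parameter values below are made up for illustration.

```python
import random

# Illustrative epsilon-greedy bandit (arm probabilities are made up).
# Each arm pays 1 with an unknown probability; the agent trades off
# exploring arms against exploiting the current best estimate.

def epsilon_greedy(true_probs, steps=10_000, eps=0.1):
    counts = [0] * len(true_probs)    # pulls per arm
    values = [0.0] * len(true_probs)  # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if random.random() < eps:                      # explore
            arm = random.randrange(len(true_probs))
        else:                                          # exploit
            arm = max(range(len(true_probs)), key=lambda a: values[a])
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return values, total_reward

estimates, total = epsilon_greedy([0.2, 0.5, 0.7])
print(estimates)  # should approach [0.2, 0.5, 0.7]
print(total)      # close to 0.7 * 10_000 once the best arm dominates
```

Note what is missing relative to RL: there is no state and no Markovian transition structure, which is exactly the distinction the two blog posts above debate.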
Students with Disabilities
If you have a documented disability that may affect academic performance, you should: 1) make sure this documentation is on file with Disability Support Services (Allen Center, Room 111 / adarice@rice.edu / x5841) to determine the accommodations you need; and 2) meet with me to discuss your accommodation needs.