COMP 640: Graduate Seminar in Machine Learning
- Instructor: Anshumali Shrivastava (anshumali AT rice.edu)
- Class Timings: Monday 4pm - 5:35pm
- Location: Duncan Hall 1046
- Office Hours: Monday 3:00pm - 4:00pm
Structure
We will pick popular papers in Deep Learning/Machine Learning and discuss them.
This year our theme is The Unification of Deep Learning Systems. We start by understanding recommendation systems and why they are the largest workload. NLP increasingly looks like recommendation systems and keeps growing in that direction, and since last year vision has been moving toward NLP.
The class will start with an informal short introduction (20 minutes) to the key concepts and contributions. Then we will open the floor to two pre-assigned groups. One group will argue why the paper deserves all the credit and what subtleties of deep learning it brings to the table. The other group will serve as devil's advocates: their job is to argue either that, based on what was known up to the paper's publication, it presents nothing new, or that its claims are questionable in light of more recently published work.
An argument is only valid if it is supported by at least one of the following:
- Mathematical Reductions and Equivalences.
- Claims/Arguments/Theorems from published works.
- Experimental data and plots that appeared in published works.
The class aims to introduce graduate students to a rigorous process of forming valid scientific arguments to judge an idea. The hope is that this debate process will serve four purposes:
1. Create a practice of questioning every claim, even those made in some of the most famous papers.
2. Develop a deeper understanding of the paper.
3. Help students understand what the community considers a contribution, which will shape their thought process before writing papers.
4. Make them better future reviewers.
The class will be divided randomly into FOR and AGAINST groups. Papers will be posted one week in advance. Each group will jointly submit a 2-page report of its findings and arguments.
Grading and Logistics
For 1 credit: class participation and the 1-page reports, plus signing up for one informal short introduction and one discussion summarization. In addition, students can undertake a semester-long research project for 3 credits; discuss the 3-credit requirements with the instructor.
Prerequisite
A rigorous course in machine learning is required. We will be discussing advanced ML papers every week.
Presentations and Summarization Logistics
Participation in the debates, along with the 2-page report, is required.
In addition, each student should sign up for 1 class to present (2 students per class) and 1 class to summarize the discussions (2 students per class). You cannot summarize the same class that you presented. The summarization should be submitted no later than one week after the presentation.
Please sign up for the scribe and presentation assignments at the Google Spreadsheet
Schedule
- 08/30 : Introduction, Logistics, Familiarity with the Class and Warmup Debates. slide
- 09/06 : LABOR DAY
- 09/13 : Recommendation Systems at Microsoft (DSSM), Amazon (Semantic Product Search), and Facebook (DLRM)
- Microsoft: Learning Deep Structured Semantic Models for Web Search using Clickthrough Data paper
- Amazon: Semantic Product Search paper
- Facebook: Deep Learning Recommendation Model (DLRM) for Personalization and Recommendation Systems paper
- 09/20 : Question Answering (It is Information Retrieval!)
- Question Answering with Subgraph Embeddings paper
- The concept of learning embeddings is quite old. Here is a good paper from 2012: Learning to Embed Songs and Tags for Playlist Prediction paper
- The famous BERT Paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding paper
- 09/27 : AlphaFold (Structured Prediction. Embeddings again!)
- Highly accurate protein structure prediction with AlphaFold paper
- DeepMind Blog link
- Old Paper: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation paper
- 10/04 : Deep Reinforcement Learning
- A Nice Tutorial link
- Actor Critic Model in RL pdf
- Is RL a fancy supervised regression that predicts discounted rewards? blog (see the discounted-return sketch after the schedule)
- 10/11 : MIDTERM RECESS
- 10/18 : Efficient Training and Sparsity
- Scalable and Sustainable Deep Learning pdf (early 2016)
- High Level Talk on Scalable and Sustainable Deep Learning link
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer pdf (2017)
- 10/25 : Memorization and Generalization
- Wide & Deep Learning for Recommender Systemspdf
- Understanding deep learning requires rethinking generalization pdf
- The Expressive Power of Neural Networks: A View from the Width pdf
- 11/01 : Overparametrization and Implicit Regularization
- The Role of Explicit Regularization in Overparameterized Neural Networks video
- A good presentation on this topic pdf
- Implicit Regularization in Over-parameterized Neural Networks pdf
- Larger Models Train Faster: Empirically Characterizing Overparameterization Impact on Convergence pdf
- 11/08 : Compressing Neural Networks (TinyAI or TinyML)
- A Blog Summarizing several Ideas link
- Compressing Neural Networks with the Hashing Trick pdf
- Chip Design using RL blog
- CoCoPIE: Making Mobile AI Sweet As PIE Compression-Compilation Co-Design Goes a Long Way pdf
- 11/15 : AutoML
- 11/22 : Multi-Armed Bandits (Simple RL; see the epsilon-greedy sketch after the schedule)
- A nice Blog (The Multi-Armed Bandit Problem and Its Solutions) link
- Contextual Bandits link
- Contextual Bandits as a Simplified Special Case of RL blog1
- The Weak Assumptions of Multi-Armed Bandits vs. the Strong Markovian Assumption of RL blog2
- 11/29 :
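For the 10/04 session, here is a minimal sketch of the "RL as supervised regression" framing the blog post asks about: compute discounted returns from a reward trajectory and treat them as regression targets for a value predictor. This is an illustrative warm-up, not code from any assigned paper; the function name and example rewards are made up.

```python
# Illustrative sketch (not from any assigned paper): discounted returns
# G_t = r_t + gamma * G_{t+1}, the regression targets behind the
# "RL as supervised regression on discounted rewards" framing.

def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# A 4-step episode with a single terminal reward:
print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.9))
# ~[0.729, 0.81, 0.9, 1.0] (up to floating-point noise)
```

A value network trained by regression on these targets is exactly the "fancy supervised regression" in question; the debate is whether the bootstrapping and exploration in full RL make it fundamentally more than that.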
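Similarly, for the 11/22 session, a minimal epsilon-greedy sketch of the multi-armed bandit problem may help anchor the discussion of how weak the bandit assumptions are compared to full RL. The arm probabilities and parameter values below are made up for illustration.

```python
import random

# Illustrative epsilon-greedy bandit (arm probabilities are made up).
# Each arm pays 1 with an unknown probability; the agent trades off
# exploring arms against exploiting the current best estimate.

def epsilon_greedy(true_probs, steps=10_000, eps=0.1):
    counts = [0] * len(true_probs)    # pulls per arm
    values = [0.0] * len(true_probs)  # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if random.random() < eps:                      # explore
            arm = random.randrange(len(true_probs))
        else:                                          # exploit
            arm = max(range(len(true_probs)), key=lambda a: values[a])
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return values, total_reward

estimates, total = epsilon_greedy([0.2, 0.5, 0.7])
print(estimates)  # should approach [0.2, 0.5, 0.7]
print(total)      # close to 0.7 * 10_000 once the best arm dominates
```

Note what is missing relative to RL: there is no state and no Markovian transition structure, which is exactly the distinction the two blog posts above debate.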
Students with Disabilities
If you have a documented disability that may affect academic performance, you should: 1) make sure this documentation is on file with Disability Support Services (Allen Center, Room 111 / adarice@rice.edu / x5841) to determine the accommodations you need; and 2) meet with me to discuss your accommodation needs.