COMP 441/COMP 552: Large Scale Machine Learning
- Instructor: Anshumali Shrivastava (anshumali AT rice)
- TA: Chen Luo (cl67 AT rice)
- Class Timings: Mon/Wed/Fri 11am-11:50am
- Location: DCH 1042
- Office Hours: Mon 3-4pm DH 3109 (Chen), Wed 11:50-12:30 DH 3118 (Anshu)
Overview
Learning from large datasets is becoming ubiquitous in applications spanning
robotics, medical decisions, the internet, communication, biology, etc. This is a semester-long, project-based
course designed to give students a thorough grounding in the theory and algorithms
needed for research and practical applications in machine learning for modern massive datasets.
Topics draw from machine learning, classical statistics, algorithms and information theory.
Grading
Prerequisite
Familiarity with basics in linear algebra, probability, and machine learning is required.
Schedule
- 01/09 : Introduction, Logistics, Mark and Re-capture Estimation. (slides) (scribe)
- 01/11 : Basic Sampling and Tail Inequalities. (scribe)
- 01/13 : Large-Scale Search: Introduction, Inverted Index. (slides)
- 01/16 : MARTIN LUTHER KING, JR. DAY (HOLIDAY - NO SCHEDULED CLASSES)
- 01/18 : Large-Scale Search: Locality Sensitive Hashing (LSH) (scribe)
- 01/20 : Minwise Hashing, Duplicate Detection and Clustering the Web (scribe)
- 01/23 : Some Adventures with Probabilistic Hashing
- 01/25 : Universal Hashing
- 01/27 : Sketching: Bloom Filters.
- 01/30 : Sketching: Count-Min Sketch and Count-Sketch
- 02/01 : Johnson-Lindenstrauss Lemma and Random Projections.
- 02/03 : Large-Scale SVD using Random Projections.
- 02/06 : SVMs and Kernels.
- 02/08 : Kernel Features.
- 02/10 : SPRING RECESS NO CLASS
- 02/13 : Guest Lecture: Word Embeddings
- 02/15 : Metric/Kernel Learning
- 02/17 : Feature Hashing and LSH based Random Features
- 02/20 : Convexity.
- 02/22 : Gradient Descent and Convergence.
- 02/24 : Variants of Gradient Descent.
- 02/27 : Stochastic Gradient Descent and Recent Advances
- 03/01 : Recommender Systems: Basic Neighborhood Models
- 03/03 : Recommender Systems: Factorization Models
- 03/06-08 : MidTerm Project Presentations
- 03/10 : Large-Scale Matrix Factorizations
- 03/13-17: SPRING BREAK
- 03/20 : Submodular Optimization 1
- 03/22 : Submodular Optimization 2
- 03/24 : Training Massive Deep Networks
- 03/27 : Compute and Memory Efficient Deep Networks 1
- 03/29 : Compute and Memory Efficient Deep Networks 2
- 03/31 : PageRank in Graphs
- 04/03 : Mining Massive Graphs 1
- 04/05 : Mining Massive Graphs 2
- 04/07 : Active Learning
- 04/10 : Crowdsourcing and Human-in-the-Loop Learning
- 04/12 : A/B Testing
- 04/14 : Online Learning and Weighted Majority Algorithm
- 04/17 : Multi-armed Bandits and UCB Algorithm
- 04/19-21 : Final Project Presentations
Students with Disabilities
If you have a documented disability that may affect academic performance, you should: 1) make sure this documentation is on file with Disability Support Services (Allen Center, Room 111 / adarice@rice.edu / x5841) to determine the accommodations you need; and 2) meet with me to discuss your accommodation needs.