CS229T/STATS231: Statistical Learning Theory

Administrative information
Instructor: Tengyu Ma
Course assistants:
Office hours: see Google Calendar
Contact: Please use Piazza for questions and discussions.
Course content
Description: When do machine learning algorithms work and why? How do we formalize what it means for an algorithm to learn from data? How do we use mathematical thinking to design better machine learning methods? This course focuses on developing a theoretical understanding of the statistical properties of learning algorithms.
Late policy: You have two late days in total, which may be used for homeworks, the paper review, or the project, but not for scribe notes. Late work done in pairs using x late days will cost x late days from each contributor. Under extenuating circumstances, you may request an extension by contacting the course staff.
Collaboration policy: We encourage you to form study groups and discuss homeworks. However, you must write up all homeworks from scratch independently, without referring to any notes from the joint session. Please follow the honor code.
Texts and References

There is no required text for the course. A number of useful references:

Schedule (subject to change)
Week 1
  • Mon 09/24: Lecture 1: overview, formulation of prediction problems, error decomposition [Scribe notes]
  • Wed 09/26: Lecture 2: asymptotics of maximum likelihood estimators (MLE) [Scribe notes]
Week 2
  • Mon 10/01: Lecture 3: uniform convergence overview, finite hypothesis class [Scribe notes]
  • Mon 10/01: Homework 1 out
  • Wed 10/03: Lecture 4: naive epsilon-cover argument, concentration inequalities [Scribe notes] [Advanced Reading]
  • Wed 10/03: Homework 0 (warmup) due
Week 3
  • Mon 10/08: Lecture 5: Sub-Gaussian random variables, Rademacher complexity [Scribe notes]
  • Wed 10/10: Lecture 6: Rademacher complexity, margin theory [Scribe notes]
  • Wed 10/10: Homework 1 due
  • Thu 10/11: Homework 2 out
Week 4
  • Mon 10/15: Lecture 7: Rademacher complexity, neural networks [Scribe notes]
  • Wed 10/17: Lecture 8: Margin-based generalization error of two-layer neural networks [Scribe notes]
Week 5
  • Mon 10/22: Lecture 9: VC dimension, covering techniques [Scribe notes]
  • Wed 10/24: Lecture 10: Covering techniques, overview of GANs [Please refer to Percy's notes (pages 88-95) for the covering part and Lecture 11 for the overview of GANs]
Week 6
  • Mon 10/29: Lecture 11: Total variation distance, Wasserstein distance, Wasserstein GANs [Scribe notes]
  • Wed 10/31: Lecture 12: Generalization and approximation in Wasserstein GANs [Scribe notes]
  • Thu 11/01: Homework 2 (uniform convergence) due
  • Thu 11/01: Homework 3 out
Week 7
  • Mon 11/05: Lecture 13: Restricted Approximability, overview of online learning [Scribe notes]
  • Wed 11/07: Lecture 14: Online learning, online convex optimization, Follow the Leader (FTL) algorithm [Scribe notes] [Additional Reading (Notes by Haipeng Luo)]
Week 8
  • Mon 11/12: Lecture 15: Follow the Regularized Leader (FTRL) algorithm [Scribe notes]
  • Wed 11/14: Lecture 16: FTRL in concrete problems: online regression & expert problem, convex to linear reduction [Scribe notes]
  • Wed 11/14: Exam (6-10pm, classroom: Bishop Auditorium, Lathrop 296)
Week 9
  • Mon 11/26: Lecture 17: Multi-armed bandit problem, general OCO with partial observation [Scribe notes]
  • Wed 11/28: Lecture 18: Multi-armed bandit problem in the stochastic setting [Scribe notes]
  • Wed 11/28: Homework 3 due
Week 10
  • Mon 12/03: Lecture 19: Regret bound for UCB, Bayesian setup, Thompson sampling [Scribe notes]
  • Wed 12/05: Lecture 20: Information theory, regret bound for Thompson Sampling [Scribe notes]
  • Fri 12/07: Paper review due
  • Fri 12/07: Final project due (if you didn't do the paper review)