CS229T/STATS231: Statistical Learning Theory
Stanford / Autumn 2018-2019
- 12/08: Homework 3 Solutions have been posted!
- 11/26: exam2018-solutions have been posted!
- 11/08: Homework 2 Solutions have been posted!
- 11/02: Homework 3 is out! It's due on Wednesday, 11/28, 11pm.
- 11/01: exam2013-solution,
exam2014-solution, exam2015-solution, exam2016-solution have been posted!
- 10/17: Homework 1 Solutions have been posted!
- 10/14: Homework 2 is out! It's due on Thursday, 11/1, 11pm.
- 10/08: Homework 0 Solutions have been posted!
- 10/02: Homework 1 is out! It's due on Wednesday, 10/10, 11pm.
- 9/29: some backgrounds on linear algebra, optimization, and probability. Note: all the mathematical statements in the document can be cited in the homework solutions without proofs.
- 9/22: Homework 0 is out! It's due on Wednesday, 10/03, 11pm.
- 9/8: Welcome to CS229T/STATS231! Previous years' home pages are here and here for reference. (Currently this page is still under construction.)
- Yu Bai (head CA)
- Tum Chaturapruek
- Jim Zhiyuan Li
- Luigi Nardi
- Colin Wei
Office hours: see Google Calendar
Please use Piazza
for questions and discussions.
When do machine learning algorithms work and why? How do we formalize what it means for an algorithm to learn from data? How do we use mathematical thinking to design better machine learning methods?
This course focuses on developing a theoretical
understanding of the statistical properties of learning algorithms.
- Uniform convergence (VC dimension, Rademacher complexity, etc)
- Implicit/algorithmic regularization, generalization theory for neural networks
- Kernel methods
- Online learning and bandits problems
- Unsupervised learning: exponential family, method of moments, statistical theory of GANs
- A solid background in
and general ability to do mathematical proofs
- Machine learning (CS229) or statistics (STATS315A)
- Convex optimization (EE364A) is recommended
- Homeworks (40%): there will be three homeworks (plus a warmup which does not count towards your grade),
centered around proving properties of statistical procedures.
Each homework must be submitted through Gradescope. Sign up for the course using entry code M4V34N.
You are encouraged to use LaTeX to writeup your homeworks
(here's a template), but this is not a requirement. You will receive
one (1) bonus point for submitting a typed written assignment (e.g. LaTeX, Microsoft Word).
We will accept scanned handwritten assignments but they will not receive the bonus point.
- Exam (25%): open-book, open-notes.
Problems will be like the homeworks, but simpler.
You can use laptops as long as you turn off the wireless.
Date: Wed Nov 14, 6-10 PM, Bishop Auditorium, Lathrop296
- Paper review (30%):
you will write a 2-4 page review of papers. The goal is to learn to
read technically demanding papers critically, and hopefully in the process, generate
novel research ideas. Your review should not only summarize the main result of the paper,
but critique it, instantiate it on examples, discuss its overall significance,
and suggest possible future directions.
See this Google doc
for detailed guidelines and a list of papers. The paper reviews can be done in pairs. Paper reviews that are done in pairs will be evaluated with a slightly higher bar, and they ideally should contain reviews for two closely-related papers and are allowed two additional pages.
Appendix or references beyond the page limit are allowed, but you will not be graded based on them.
Instead of doing the paper review, with approval from the course
staff on the project topic, you can do a final project. Please come to the Tengyu Ma or Yu Bai's office hours to request the approval by briefly describing the project plan. We don't encourage you to do the project unless you own research area is closely related to machine learning theory. The project can be done in pairs. The page limit for project report is 8 pages, not including reference or appendix.
The review and the project should be submitted electronically by 11pm.
- Scribe notes (5%):
Because there is no textbook or set of readings that perfectly fits this course, you will be asked to scribe a note for a lecture in LaTeX. The course staff will select one note for each lecture and share it with other students. 1% bonus credit will be given if your note is selected for posting. See this Google doc for the detailed guidelines. The scribe notes are due 2 days after the lecture (11pm Wed for Mon lecture, and Fri 11pm for Wed lecture). Please sign up here before Sept 29th and plan the time ahead. Extra credits will be given to the notes that are selected for posting.
The scribe notes can be done in pairs.
Late policy: Two total late days are accepted for homeworks, paper review, or projects, but not scribe notes. Late work done in pairs with x late days will require x late days from each of the contributors. Under extentuating circumstances, you may request an extension by contacting the course staff.
we encourage you to form study groups and
discuss homeworks. However, you must write up all homeworks from scratch independently without referring to any notes from the joint session.
Please follow the honor code
There is no required text for the course. A number of useful references:
(subject to change)
- Mon 09/24: Lecture 1: overview, formulation of prediction
problems, error decomposition [Scribe notes]
- Wed 09/26: Lecture 2: asymptotics of maximum likelihood estimators (MLE) [Scribe notes]
- Mon 10/01: Lecture 3: uniform convergence overview, finite
hypothesis class [Scribe notes]
- Mon 10/01: Homework 1 out
- Wed 10/03: Lecture 4: naive epsilon-cover argument, concentration inequalities
- Wed 10/03: Homework 0 (warmup) due
- Mon 10/08: Lecture 5: Sub-Gaussian random variables, Rademacher complexity
- Wed 10/10: Lecture 6: Rademacher complexity, margin theory
- Wed 10/10: Homework 1 due
- Thu 10/11: Homework 2 out
- Mon 10/15: Lecture 7: Rademacher complexity, neural networks
- Wed 10/17: Lecture 8: Margin-based generalization error of
two-layer neural networks
- Mon 10/22: Lecture 9: VC dimension, covering techniques
- Wed 10/24: Lecture 10: Covering techniques, overview of GANs
[Please refer to Percy's notes
(page 88-95) for the covering part and Lecture 11 for the
overview of GANs]
- Mon 10/29: Lecture 11: Total variation distance, Wasserstein distance, Wasserstein GANs
- Wed 10/31: Lecture 12: Generalization and approximation in
- Thu 11/01: Homework 2 (uniform convergence) due
- Thu 11/01: Homework 3 out
- Mon 11/05: Lecture 13: Restricted Approximability, overview of
- Wed 11/07: Lecture 14: Online learning, online convex optimization, Follow the Leader (FTL) algorithm
[Additional Reading (Notes by Haipeng Luo)]
- Mon 11/12: Lecture 15: Follow the Regularized Leader (FTRL) algorithm
- Wed 11/14: Lecture 16: FTRL in concrete problems: online regression & expert problem, convex to linear reduction
- Wed 11/14: Exam (6-10pm, classroom: Bishop Auditorium, Lathrop296)
- Mon 11/26: Lecture 17: Multi-armed bandit problem, general OCO with partial observation
- Wed 11/28: Lecture 18: Multi-armed bandit problem in the
- Wed 11/28: Homework 3 due
- Mon 12/03: Lecture 19: Regret bound for UCB, Bayesian setup,
- Wed 12/05: Lecture 20: Information theory, regret bound for
- Fri 12/07: Paper review due
- Fri 12/07: Final project due (if you didn't do the paper review)