Instructor: John Duchi (jduchi stanford edu)
Lectures: Tuesday/Thursday, 13:15--14:30, Building 200 (History Corner), Room 030
TA: Yuming Kuang (ymkuang stanford edu)
Office hours: John: Tuesday/Thursday 14:30--15:30 in 126 Sequoia Hall. Yuming: Wednesday 13:30--15:30 in 216 Sequoia Hall.
Description: Information theory was developed to solve
fundamental problems in the theory of communications, but its
connections to statistical estimation and inference date back
nearly to the birth of the field. With their focus on
fundamental limits, information-theoretic techniques have
provided deep insights into optimal procedures for a variety of
inferential tasks. In addition, the basic quantities of
information theory--entropy and relative entropy, and their
generalizations to other divergence measures such as
f-divergences--are central in many areas of mathematical
statistics and probability.
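For readers who want a quick reminder, these quantities are
defined as follows (following Cover and Thomas, Chapter 2): for
distributions P and Q with mass functions p and q (sums become
integrals in the continuous case),
\[
  H(P) = -\sum_x p(x) \log p(x),
  \qquad
  D_{\mathrm{kl}}(P \| Q) = \sum_x p(x) \log \frac{p(x)}{q(x)},
\]
\[
  D_f(P \| Q) = \sum_x q(x) \, f\!\Big(\frac{p(x)}{q(x)}\Big)
  \quad \text{for convex } f \text{ with } f(1) = 0.
\]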
The applications of these tools are numerous. In mathematical
statistics, for example, they allow characterization of optimal
error probabilities in hypothesis testing, determination of
minimax rates of convergence for estimation problems, and
demonstration of equivalence between (ostensibly) different
estimation problems, and they lead to penalized estimators and
the minimum description length principle. In probability, they
provide insights into the central limit theorem and large
deviations theory (via Sanov's theorem and other results), and
they appear in empirical process theory and concentration of
measure. Information-theoretic techniques also arise in game
playing, gambling, and stochastic optimization and
approximation, among other areas.
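To make the first of these applications concrete, the sketch
below (not part of the course materials; the Bernoulli
parameters and sample sizes are arbitrary choices for
illustration) checks the Chernoff-Stein lemma numerically: in
testing H0: X_i ~ Ber(p) against H1: X_i ~ Ber(q) with the
type-I error held at most alpha, the type-II error beta_n of the
optimal (Neyman-Pearson) test satisfies
-(1/n) log beta_n -> D_kl(Ber(p) || Ber(q)).

# Numerical sketch of the Chernoff-Stein lemma; convergence of the error
# exponent is slow (the gap is of order 1/sqrt(n)), so the printed values
# approach D(Ber(p) || Ber(q)) from below.
import numpy as np
from scipy.stats import binom

p, q, alpha = 0.6, 0.3, 0.05  # H0 mean, H1 mean, type-I error budget

def kl_bernoulli(p, q):
    """Relative entropy D(Ber(p) || Ber(q)) in nats."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

for n in [50, 200, 800, 3200]:
    # With p > q the likelihood ratio is increasing in the number of ones S,
    # so the Neyman-Pearson test accepts H0 when S >= k. Take the largest k
    # whose type-I error P_p(S < k) is at most alpha.
    k = np.searchsorted(binom.cdf(np.arange(n + 1), n, p), alpha, side="right")
    beta = binom.sf(k - 1, n, q)  # type-II error P_q(S >= k), computed exactly
    print(f"n = {n:4d}   -(1/n) log beta_n = {-np.log(beta) / n:.4f}")

print(f"D(Ber({p}) || Ber({q})) = {kl_bernoulli(p, q):.4f}")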
In this course, we will study information-theoretic quantities
and their connections to estimation and statistics in some
depth, showing applications to many of the areas above. Except
to provide background, we will not cover standard
information-theoretic topics such as source coding or channel
coding, focusing instead on the probabilistic and statistical
consequences of information theory.
Here is a more precise syllabus, including evaluation criteria and topics.
The readings listed below are supplementary to the lectures and
not required, but they should give additional perspective on the
material.
Cover and Thomas: Elements of Information Theory, Second Edition, Wiley 2006.
Tsybakov: Introduction to Nonparametric Estimation, Springer 2009.
Date | Topics | Reading | Assignments |
September 23 [pdf] | Course overview | | |
September 25 [pdf] | Review of information theory; divergences | Cover and Thomas, Chapter 2 | Problem Set 1 out [pdf] |
September 30 | No class (travelling) | N/A | N/A |
October 2 [pdf] | Minimax lower bounds: Le Cam's method | Tsybakov, Chapters 2.1-2.4 | |
October 7 [pdf] | Minimax lower bounds: Fano's method and packing | Tsybakov, Chapters 2.5-2.7 | Problem Set 1 due. Solutions: [pdf] |
Rescheduled: October 10 [pdf] | Minimax lower bounds: Fano's method, Assouad's method | | Problem Set 2 out [pdf] |
October 14 [pdf] | Nonparametrics and minimax bounds: Assouad's method, nonparametric regression | Tsybakov, Chapters 1.1-1.2 and 1.5 | |
October 16 [pdf] | Global Fano method | Yang and Barron 1999 | |
October 21 [pdf] | Exponential families and maximum entropy | Wainwright and Jordan, Chapter 3 | |
October 23 [pdf] | Maximum entropy and robustness | | Problem Set 2 due. Solutions: [pdf] |
October 28 [pdf] | Fisher information and associated metrics | | Problem Set 3 out [pdf] |
October 30 [pdf] | Regret and redundancy | | |
November 4 [pdf] | Bayesian redundancy, Jeffreys prior, reference prior, redundancy capacity | | |
November 6 [pdf] | Prediction games without the log loss | | |
November 11 [pdf] | Finishing prediction games; surrogate risk consistency | | Problem Set 3 due. Solutions: [pdf] |
November 13 [Draft outline] | Surrogate risk consistency: classification calibration | | |
November 18 [Draft outline] | Risk and divergences: binary experiments and classification | | Problem Set 4 out [pdf] |
November 20 [Draft outline] | Binary experiments; concentration inequality intro | | |
December 2 [Draft outline] | Concentration inequalities: the entropy method | | Problem Set 4 due. Solutions: [pdf] |