# jemdoc: menu{MENU}{index.html}, showsource
= Statistics 311/Electrical Engineering 377: Information Theory and Statistics
[http://www.stanford.edu/~jduchi/ John Duchi], Stanford University, Winter 2019
== Lectures
Tuesday and Thursday, 1:30 - 2:50 PM in Math Building 380-380Y
== Course Staff Email Address
stats311-win1819-staff_albatross_lists_duck_stanford_duck_edu (replace _albatross_ with @ and _duck_ with .)
== Instructor
John Duchi
- Office hours: Tuesdays and Thursdays, 3:00pm - 4:00pm, 126 Sequoia Hall.
== Teaching Assistants
Leighton Barnes
- Office hours: TBD.
== Prerequisites
Mathematical maturity and any convex combination of Stats 300A, Stats 310A, CS229, EE376a
== Description
Information theory was developed to solve fundamental problems in the
theory of communications, but its connections to statistical
estimation and inference date nearly to the birth of the field. With
their focus on fundamental limits, information theoretic techniques
have provided deep insights into optimal procedures for a variety of
inferential tasks. In addition, the basic quantities of information
theory--entropy and relative entropy and their generalizations to
other divergence measures such as f-divergences--are central in many
areas of mathematical statistics and probability.
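These basic quantities are easy to compute directly from their definitions. As an informal illustration (not part of the course materials), the following Python sketch computes entropy, relative entropy (KL divergence), and a general f-divergence for discrete distributions; note that KL divergence is the f-divergence with f(t) = t log t.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Relative entropy D(p||q) = sum_i p_i log(p_i / q_i).

    Returns +inf if p puts mass where q does not.
    """
    d = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            d += pi * math.log(pi / qi)
    return d

def f_divergence(p, q, f):
    """General f-divergence D_f(p||q) = sum_i q_i f(p_i / q_i), assuming q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(entropy(p))           # log 2, about 0.693 nats
print(kl_divergence(p, q))  # about 0.511; positive since p != q
# KL divergence recovered as the f-divergence with f(t) = t log t
print(f_divergence(p, q, lambda t: t * math.log(t) if t > 0 else 0.0))
```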
The applications of these tools are numerous. In mathematical
statistics, for example, they allow characterization of optimal error
probabilities in hypothesis testing, determination of minimax rates of
convergence for estimation problems, demonstration of equivalence
between (ostensibly) different estimation problems, and lead to
penalized estimators and the minimum description length principle. In
probability, they provide insights into the central limit theorem,
large deviations theory (via Sanov's theorem and other results), and
appear in empirical process theory and concentration of
measure. Information theoretic techniques also arise in game playing,
gambling, stochastic optimization and approximation, among other
areas.
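As a quick, unofficial numerical sanity check of the large-deviations flavor of these results: for n i.i.d. Bernoulli(p) coin flips, the probability that the sample mean exceeds a level a > p satisfies the Chernoff upper bound exp(-n D(Bern(a) || Bern(p))), and Sanov's theorem shows this exponent is tight. A minimal Monte Carlo sketch (the parameters here are illustrative, not from the course):

```python
import math
import random

def kl_bernoulli(a, p):
    """KL divergence D(Bern(a) || Bern(p)), in nats, for a, p in (0, 1)."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

random.seed(0)
p, a, n, trials = 0.5, 0.7, 50, 100_000

# Monte Carlo estimate of P(sample mean of n Bernoulli(p) flips >= a)
hits = sum(
    sum(random.random() < p for _ in range(n)) >= a * n
    for _ in range(trials)
)
empirical = hits / trials

# Chernoff (large-deviations) upper bound: exp(-n * D(Bern(a) || Bern(p)))
bound = math.exp(-n * kl_bernoulli(a, p))
print(f"empirical tail probability: {empirical:.5f}")
print(f"Chernoff upper bound:       {bound:.5f}")
```

The empirical probability falls below the bound, and as n grows the two agree to first order in the exponent (the gap is a polynomial prefactor).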
In this course, we will study information-theoretic quantities and
their connections to estimation and statistics in some depth, with
applications to many of the areas above. Except to provide background,
we will not cover standard information-theoretic topics such as source
coding or channel coding; instead, we focus on the probabilistic and
statistical consequences of information theory.
== Texts
Required:
- [lecture-notes.pdf Lecture notes] Lecture notes I am preparing for
the course. These will change throughout the quarter, as I am rewriting
them to reflect the actual content of the lectures, so be sure to reload them frequently.
Recommended: the following texts are not necessary, but will give additional perspective on the material in the class.
- [https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-book.pdf /High Dimensional Probability/], by Roman Vershynin, to be published by Cambridge University Press.
- [https://www.math.uci.edu/~rvershyn/papers/non-asymptotic-rmt-plain.pdf Introduction to the non-asymptotic analysis of random
matrices], by Roman Vershynin, 2012.
- [https://link.springer.com/book/10.1007/b13794 /Introduction to Nonparametric Estimation/], by Alexandre Tsybakov, 2009.
- /Elements of Information Theory/, Second Edition, by Cover and Thomas, 2006. Wiley.
- /Concentration Inequalities: A Nonasymptotic Theory of Independence/, by Stephane Boucheron, Gabor Lugosi, and Pascal Massart, 2013. Oxford University Press.
== Grading
Your grade will be determined by approximately four problem sets (50\%) and a final project (50\%).