We will follow the outline below:
0. Overview
1. Basic concepts (1 week)
2. Entropy and data compression (2 weeks)
Source entropy rate. Typical sequences and asymptotic equipartition property.
Source coding theorem. Huffman code.
Universal compression and distribution estimation.
3. Mutual information, capacity and communication (3-4 weeks)
Channel capacity. Fano's inequality and data processing theorem.
Jointly typical sequences. Noisy channel coding theorem.
Achieving capacity efficiently: polar codes.
Gaussian channels, continuous random variables and differential entropies.
4. Applications to statistics and machine learning (3 weeks)
Maximum entropy principle, maximum conditional entropy principle, and their duality with maximum likelihood.
Method of types and applications to hypothesis testing.
Information limits of other inference problems.
Estimation of information measures.
