Information-theoretic Lower Bounds in Data Science

“Grant me the serenity to accept the things I cannot change, courage to change the things I can, and wisdom to know the difference,” Reinhold Niebuhr (1892–1971)

Description:

Much scientific research is devoted to providing constructive and explicit solutions to real problems. However, there is another important line of research aimed at understanding what is impossible. Impossibility results allow us to characterize fundamental limits, identify unrealistic targets, provide optimality guarantees for constructive procedures, and recognize bottlenecks in problem formulations that may call for improvement. Establishing impossibility results is a common task in various fields of data science; for example: What is the smallest error one could achieve in a classification problem? How many iterations are needed to approach the optimal solution in optimization? How much computation, communication, or memory is needed to learn the distribution of the data? What is the optimal data querying or collection scheme in online and reinforcement learning? What is the best representation or coding of a message in a given communication channel?

This course aims to explore the use of information-theoretic lower bounds in data science. On the theory side, we will develop an extensive set of ideas and tools for establishing such bounds, offering a unified exposition of seemingly disparate ideas from different fields and problems. On the applied side, we will illustrate the efficacy of these ideas in a wide range of problems spanning machine learning, statistics, information theory, theoretical computer science, optimization, online learning and bandits, operations research, and more, and we will provide guidelines for choosing and using the appropriate set of tools for a given problem. Tools and examples are the core elements of this course, and they will be considered in a multitude of data-scientific contexts.
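
As a taste of the tools to come, one canonical example (stated here in its standard form, for orientation) is Fano's inequality, which converts an estimation or learning problem into a multiple-hypothesis test:

\[
\mathbb{P}\big(\hat{V} \neq V\big) \;\ge\; 1 \;-\; \frac{I(V; X) + \log 2}{\log M},
\]

where V is uniform on {1, ..., M}, X is the observed data, \hat{V} = \hat{V}(X) is any estimator, and I(V; X) is the mutual information between V and X. An upper bound on I(V; X) thus translates directly into a lower bound on the error of every possible procedure; variants of this reduction appear throughout the reference materials listed below.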

Prerequisites:

EE 278, CS 229T, STATS 300A, an equivalent course, or the instructor's permission. This course will be mostly self-contained, but it requires mathematical maturity at a graduate level. Background in statistics, information theory, machine learning, and/or optimization is recommended.

Course requirements:

There will be biweekly (once every two weeks) homework sets covering both theoretical and applied aspects of the material. There will also be a substantial project component involving a literature review and a research component (theoretical and/or applied, per the student's inclination).

Textbook and references:

There is no required textbook for this course. Some related materials include:
Alexandre B. Tsybakov. Introduction to Nonparametric Estimation. Springer, 2009.
John C. Duchi. Lecture notes for Stanford STATS 311: http://web.stanford.edu/class/stats311/lecture-notes.pdf
Yihong Wu. Lecture notes on information theory and statistics: http://www.stat.yale.edu/~yw562/teaching/it-stats.pdf
Yanjun Han. Online lectures: https://theinformaticists.com/category/blog/online-lectures/
A list of reading materials can be found in the project guidelines.