Representations, fairness, and privacy: information-theoretic tools for machine learning

Flavio P. Calmon
Assistant Professor, Harvard
Date: Sept. 28, 2018

Abstract

Information theory can shed light on the algorithm-independent limits of learning from data and serve as a design driver for new machine learning algorithms. In this talk, we discuss a set of flexible information-theoretic tools called the principal inertia components (PICs) that can be used to (i) understand fairness and discrimination in machine learning models, (ii) provide an estimation-theoretic view of privacy, and (iii) characterize data representations learned by complex learning models. The PICs have a long history in both statistics and information theory, and provide a fine-grained decomposition of the dependence between two random variables. We illustrate these techniques on both synthetic and real-world datasets, and discuss future research directions.
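
For context, a minimal sketch of the PIC definition, following the standard correspondence-analysis formulation (this is not spelled out in the announcement, and the notation below is illustrative): for discrete random variables $X$ and $Y$ with joint distribution $p_{X,Y}$ and marginals $p_X$, $p_Y$, form the matrix
\[
Q_{x,y} = \frac{p_{X,Y}(x,y)}{\sqrt{p_X(x)\, p_Y(y)}},
\]
whose singular values satisfy $1 = \sigma_0 \ge \sigma_1 \ge \sigma_2 \ge \cdots$. The principal inertia components are $\lambda_k = \sigma_k^2$ for $k \ge 1$. The largest PIC, $\lambda_1$, equals the square of the Hirschfeld-Gebelein-Rényi maximal correlation between $X$ and $Y$, and $\sum_{k \ge 1} \lambda_k = \chi^2\!\left(p_{X,Y} \,\|\, p_X p_Y\right)$, which is the sense in which the PICs decompose a single dependence measure into a finer spectrum.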

Bio

Flavio P. Calmon is an Assistant Professor of Electrical Engineering at Harvard's John A. Paulson School of Engineering and Applied Sciences. Before joining Harvard, he was the inaugural data science for social good post-doctoral fellow at IBM Research in Yorktown Heights, New York. He received his Ph.D. in Electrical Engineering and Computer Science from MIT. His main research interests are information theory, inference, and statistics, with applications to privacy, fairness, machine learning, and content distribution. Prof. Calmon has received the IBM Open Collaborative Research Award, the Lemann Brazil Research Award, and the Harvard Dean's Competitive Fund for Promising Scholarship.