Alon Kipnis

math/stats • information theory • ambitious data science


Two exciting announcements:
  • I have recently joined SybarIP as the Head of AI and Data Science
  • In 2022 I will be joining the Computer Science Department at IDC

Between 2017 and 2021, I was a postdoctoral research scholar and a lecturer in the Department of Statistics at Stanford University. Previously, I graduated with a PhD in electrical engineering from Stanford University. My research spans mathematical statistics, information theory, signal processing, and natural language processing.

My postdoctoral work explored classification and detection problems involving a vast number of features, of which only a few are actually useful. My approach adapts the Higher Criticism (HC) test as an unsupervised, untrained discriminator between datasets. The method proves useful in a host of real-world applications, such as text classification, detecting mutations in short-sequence genomic data, predicting trends in vector time-series data, analyzing social graphs, selecting words for topic modeling, and the early detection of economic or health crises.

Another line of my work studies the effect of data compression and communication constraints on modern estimation and learning procedures. The disproportionate size of datasets compared to available computing and communication resources, together with the wide adoption of cloud computing infrastructure, makes such constraints among the most limiting factors in modern data science applications. My work studies the fundamental limits of inference and ways to adapt standard methods in statistics and machine learning to scenarios where the data undergoes lossy compression.

My PhD work addresses the effect of data compression on sampling real-world analog data. It provides a complete characterization of the fundamental trade-off between sampling rate, compression bitrate, and system performance for any digital system processing real-world signals. This trade-off extends the classical Shannon-Nyquist sampling theorem to the (practical) situation where the samples are quantized or compressed in a lossy manner for processing in digital systems. In particular, my work shows that, for most signal models, a bitrate constraint gives rise to a new critical sampling rate, smaller than the Nyquist rate, above which the signal is optimally represented in digital form.

Between 2020 and 2021, I taught Stanford's STATS 207: Introduction to Time-Series Analysis and STATS 235: Massive Computational Experiments, Painlessly (with David Donoho and Masah Lotfi).

Selected Works

Two-sample testing for large high-dimensional multinomials

My recent work shows that the HC test has interesting optimality properties when applied to detecting differences between discrete distributions. In the most challenging situation, when the difference between the distributions is sparse and weak, the adapted HC test experiences a phase transition phenomenon.

It turns out that HC performs better than any other known test in terms of this phase transition analysis. Based on this insight, I developed and applied an HC-based test to classify text documents and detect authorship; see this page for more details. When applied to authorship attribution challenges, this technique performs as well as state-of-the-art methods, but without handcrafted features or tuning.
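As an illustration, the two-sample form of HC can be sketched in a few lines: each feature (e.g., a word) receives a p-value from an exact binomial test of its count in one sample against the pooled counts, and HC then scans the sorted p-values for an unusually large excess of small ones. This is a minimal sketch under simplifying assumptions, not the exact pipeline from my papers; the search limit alpha0 = 0.45 and the toy multinomial data below are illustrative choices.

```python
import numpy as np
from scipy.stats import binomtest

def hc_statistic(pvals, alpha0=0.45):
    """Higher Criticism of a vector of p-values (Donoho-Jin form)."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = len(p)
    i = np.arange(1, n + 1)
    # standardized deviation of each sorted p-value from uniformity
    z = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    k = max(1, int(alpha0 * n))  # search only over the smallest p-values
    return z[:k].max()

def two_sample_hc(counts1, counts2, alpha0=0.45):
    """Sketch of an HC two-sample test for frequency tables.

    For each feature, an exact binomial test compares its count in
    sample 1 against the pooled total; HC combines the p-values.
    """
    n1, n2 = counts1.sum(), counts2.sum()
    pvals = []
    for c1, c2 in zip(counts1, counts2):
        tot = int(c1 + c2)
        if tot == 0:
            continue  # feature absent from both samples
        pvals.append(binomtest(int(c1), tot, n1 / (n1 + n2)).pvalue)
    return hc_statistic(pvals, alpha0)

# toy example: two multinomial samples over 50 categories
rng = np.random.default_rng(0)
x = rng.multinomial(500, np.ones(50) / 50)
y = rng.multinomial(500, np.ones(50) / 50)
print(two_sample_hc(x, y))
```

In an authorship setting, `counts1` and `counts2` would be word-frequency tables of two documents; a large HC value indicates that the two frequency tables are unlikely to come from the same underlying distribution.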

[Figure: Phase transition]

Gaussian Approximation of Quantization Error for Estimation from Compressed Data

This work studies the performance of estimation and learning from datasets that undergo bit-level data compression. In general, it is difficult to apply existing results from statistics and learning theory to such data because of the non-linearity of the compression operation. To address this issue, this work considers the distributional connection between the lossy compressed representation of a high-dimensional signal using a random spherical code and the observation of this signal under additive white Gaussian noise (AWGN). This connection allows us to relate the risk of an estimator based on an AWGN-corrupted version of the data to the risk attained by the same estimator when fed with the lossy compressed version. We demonstrate the usefulness of this connection by deriving various novel results for inference problems under compression constraints, including sparse regression and compressed sensing under quantization or communication constraints.

Sub-Nyquist sampling under quantization and lossy compression constraints

My PhD work provided a unified treatment of processing data under three fundamental information-inhibiting processes: sampling, bit-level data compression, and random noise. While the effect of each of these processes on its own has long been well understood, my work shows that their combination undermines standard conceptions. The highlight of this work is a fundamental result showing that sampling at a rate smaller than Nyquist is optimal when the compression/quantization bitrate of the digital representation is restricted. This restriction may result from limited memory or communication rates, as in self-driving cars, or from limited processing power, as in most modern data science applications.

In other words, while sampling at or above Nyquist is needed for optimality when the bitrate is unlimited (which is never the case in practice), for each finite bitrate there exists a new critical sampling rate that is smaller than Nyquist and also depends on the bitrate. Sampling at this new rate leads to the optimal performance under the bitrate constraint.

[Figure: Sub-Nyquist sampling is optimal under bitrate constraints]

You can read more about my past work and future plans on my Research Statement page.


My office is Room 208 in Sequoia Hall. My email address is alonkipnis at gmail.