
About
I am a postdoctoral research scholar and a lecturer in the Statistics Department at Stanford University. Previously, I graduated with a PhD in electrical engineering from Stanford University. My research focuses on the mathematical principles of data science and information processing.

My recent work explores subtle classification and detection problems involving a vast number of features. My approach, based on multiple hypothesis testing and Higher Criticism (HC), is particularly appealing when only a few of the many possible features are useful, and even those are relatively weak. This approach enables the use of HC and other tests as unsupervised, untrained discriminators: no model is specified for the features, hence no learning or tuning is done. This property makes the method useful in a host of real-world applications, including text classification, detecting mutations in short-sequence genomic data, trend prediction in high-dimensional data, social graph analysis, automatic selection of words for topic modeling, and early detection of economic or health crises. As part of this research, I conducted a large number of computational experiments to verify the usefulness of the method in these applications.

Another line of my work studies the effect of compression or communication constraints on modern estimation and learning procedures. The disproportionate size of datasets compared to available computing and communication resources, as well as the wide adoption of cloud computing infrastructure, make such constraints among the most limiting factors in modern data science applications. My work studies the fundamental limits of inference and ways to adapt standard methods in statistics and machine learning to scenarios where the data undergoes lossy compression.

My PhD work addressed the effect of data compression on sampling real-world analog data. It provided the first complete characterization of the fundamental tradeoff between sampling rate, compression bitrate, and system performance for any digital system processing real-world signals. This tradeoff extends the classical Shannon–Nyquist sampling theorem to the (practical) situation where the samples are quantized or compressed in a lossy manner. In particular, my work shows that, for most signal models, a bitrate constraint imposes a new sampling rate (smaller than Nyquist) above which the signal is optimally represented.
Two-sample testing for large high-dimensional multinomials
My recent work shows that the HC test has interesting optimality properties when applied to detecting differences between discrete distributions. In the most challenging situation, when the difference between the distributions is sparse and weak, the adapted HC test experiences a phase transition phenomenon. It turns out that HC performs better than any other known test in terms of this phase transition analysis. Based on this insight, I developed and applied an HC-based test to classify text documents and detect authorship; see this page for more details. When applied to authorship attribution challenges, this technique performs as well as state-of-the-art methods, but without hand-crafting of features or tuning.
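To make the idea concrete, here is a minimal sketch of the Higher Criticism statistic in its classical form, applied to a vector of per-feature p-values. This is an illustration only: the two-sample multinomial adaptation described above differs in how the per-feature p-values are obtained, and the cutoff parameter `gamma0` is a conventional choice, not the specific one used in my work.

```python
import numpy as np

def higher_criticism(pvals, gamma0=0.5):
    """Classical Higher Criticism statistic on a set of p-values.

    HC searches for a small group of p-values that is unusually
    small compared to the uniform distribution expected under the
    global null, making it sensitive to sparse and weak effects.
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    n = len(p)
    i = np.arange(1, n + 1)
    # standardized deviation of the empirical p-value CDF from uniform
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p))
    # maximize only over the smallest gamma0-fraction of p-values
    imax = max(1, int(gamma0 * n))
    return hc[:imax].max()

# under the global null, p-values are uniform and HC stays moderate
rng = np.random.default_rng(0)
hc_null = higher_criticism(rng.uniform(size=1000))
```

A large value of the statistic indicates that some small subset of features deviates from the null, without requiring knowledge of which features those are.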

Gaussian Approximation of Quantization Error for Estimation from Compressed Data
This work considers the performance of estimation or learning from datasets undergoing bit-level data compression. In general, it is difficult to apply existing results from statistics and learning theory to such data because of the nonlinearity of the compression operation. To address this issue, this work establishes a distributional connection between the lossy compressed representation of a high-dimensional signal using a random spherical code and the observation of this signal under additive white Gaussian noise (AWGN). This connection allows us to relate the risk of an estimator based on an AWGN-corrupted version of the data to the risk attained by the same estimator when fed its lossy compressed version. We demonstrate the usefulness of this connection by deriving several novel results for inference problems under compression constraints, including sparse regression and compressed sensing under quantization or communication constraints.
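As a hypothetical illustration of the kind of estimator this theory covers, the sketch below applies coordinate-wise soft thresholding to a sparse signal observed through an AWGN channel. The point of the distributional connection above is that the risk computed for such an AWGN-based estimator carries over to the setting where the estimator is fed the lossy compressed data instead; the signal dimensions, sparsity level, and universal threshold here are illustrative choices, not parameters from the paper.

```python
import numpy as np

def soft_threshold(y, tau):
    """Coordinate-wise soft thresholding, a standard sparse estimator."""
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

rng = np.random.default_rng(1)
n, k, sigma = 10_000, 50, 1.0
x = np.zeros(n)
x[:k] = 5.0                           # a few strong coordinates, rest zero
y = x + sigma * rng.normal(size=n)    # AWGN-corrupted observation
tau = sigma * np.sqrt(2 * np.log(n))  # universal threshold
xhat = soft_threshold(y, tau)
risk = np.mean((xhat - x) ** 2)       # much smaller than the raw noise level
```

Under the AWGN-compression connection, an analysis of `risk` for this estimator also characterizes its performance on data compressed with a random spherical code at a matching rate.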

Sub-Nyquist sampling under quantization and lossy compression constraints
My PhD work provided a unified treatment of processing data under three fundamental information-inhibiting processes: sampling, bit-level data compression, and random noise. While the effect of each of these processes in isolation was well understood before, my work shows that their combination undermines standard conceptions. The highlight of this work is a fundamental result showing that sampling at a rate smaller than Nyquist is optimal when the compression/quantization bitrate of the digital representation is restricted. This restriction may be the result of limited memory or communication rates, as in self-driving cars, or limited processing power, as in most modern data science applications. In other words, while sampling at or above the Nyquist rate is needed for optimality when the bitrate is unlimited (which is never the case in practice), for each finite bitrate there exists a new critical sampling rate that is smaller than Nyquist and also depends on the bitrate. Sampling at this new rate leads to optimal performance under the bitrate constraint.
You can read more about my past work and future plans on my Research Statement page.
You can find me in Room 208 of Sequoia Hall. My email address is alonkipnis at gmail.