Research

Research Interests

My research interests are primarily in statistical machine learning, including

  • Scalable algorithms for massive data sets

  • Multivariate analysis and dimensionality reduction in large data sets

  • Data mining and selective inference

  • Methods for stable estimation and inference in heavy-tailed data

  • Ecological statistics and the so-called “presence-only problem”

Publications and Preprints

Selection Adjusted Confidence Intervals with More Power to Determine the Sign, Asaf Weinstein, William Fithian, and Yoav Benjamini, 2012, JASA.

Inference from Presence-Only Data: the Ongoing Controversy, Trevor Hastie and William Fithian, 2013, Ecography.

Finite Sample Equivalence in Statistical Models for Presence-Only Data, William Fithian and Trevor Hastie, 2013, Annals of Applied Statistics.

Local Case-Control Sampling: Efficient Subsampling in Imbalanced Data Sets, William Fithian and Trevor Hastie, 2014, Annals of Statistics (to appear).

A Proportional Observer Bias Model for Multispecies Distribution Modeling, William Fithian, Jane Elith, Trevor Hastie, and David A. Keith, 2014, Methods in Ecology and Evolution (to appear).

Semiparametric Exponential Families for Heavy-Tailed Data, William Fithian and Stefan Wager. Submitted.

Scalable Convex Methods for Flexible Low-Rank Matrix Modeling, William Fithian and Rahul Mazumder. Submitted.

Effective Degrees of Freedom: A Flawed Metaphor, Lucas Janson, William Fithian, and Trevor Hastie. Submitted.

Altitude Training: Strong Bounds for Single-Layer Dropout, Stefan Wager, William Fithian, Sida Wang, and Percy Liang. Submitted.