Computer Age Statistical Inference: Algorithms, Evidence and Data Science
by Bradley Efron and Trevor Hastie (August 2016)
Statistical Learning with Sparsity: the Lasso and Generalizations
by Trevor Hastie, Robert Tibshirani and Martin Wainwright (May 2015)
pdf (10.5Mb, corrected online)
An Introduction to Statistical Learning with Applications in R
by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (June 2013)
pdf (9.4Mb, 6th corrected printing)
The Elements of Statistical Learning:
Data Mining, Inference, and Prediction (Second Edition)
by Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009)
pdf (13Mb, corrected 11th printing)
The research reported here was partially supported by grants from the
National Science Foundation and the National Institutes of Health.
For medical papers see also
- Scott Powers, Trevor Hastie and Robert Tibshirani Customized training with an application to mass spectrometric imaging of cancer tissue. Annals of Applied Statistics 9(4) (2015), 1709-1725.
- Rakesh Achanta and Trevor Hastie Telugu OCR Framework using Deep Learning.
We build an end-to-end OCR system for Telugu script that segments the text image, classifies the characters, and extracts lines using a language model. The classification module, the most challenging of the three tasks, is a deep convolutional neural network.
- Jingshu Wang, Qingyuan Zhao, Trevor Hastie and Art Owen. Confounder Adjustment in Multiple Hypotheses Testing.
We present a unified framework for analysing different proposals for adjusting for confounders in multiple testing (e.g. in genomics). We also provide an R package
cate on CRAN that implements these different approaches. The vignette shows some examples of how to use it.
- Alexandra Chouldechova and Trevor Hastie
Generalized Additive Model Selection A method for selecting terms in an additive model, with sticky selection between null, linear and nonlinear terms, as well as the amount of nonlinearity. The R package gamsel has been uploaded to CRAN.
- Trevor Hastie, Rahul Mazumder, Jason Lee and Reza Zadeh.
Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares. We develop a new method for matrix completion that improves upon our earlier softImpute algorithm, as well as the popular ALS algorithm.
JMLR 2015, 16, 3367-3402. We have also incorporated this method in our
R package softImpute.
- William Fithian, Jane Elith, Trevor Hastie, David A. Keith. Bias Correction in Species Distribution Models: Pooling Survey and Collection Data for Multiple Species. We develop methods for teasing out observer bias in multi-species presence-only data. (arXiv:1403.7274; Methods in Ecology and Evolution, October 10, 2014).
- Will Fithian and Trevor Hastie.
Local Case-Control Sampling: Efficient Subsampling in Imbalanced Data Sets (arXiv). Modern classification tasks are often extremely unbalanced - a situation where case-control sampling can be very helpful. This paper discusses the bias of CC sampling, and proposes a simple two-stage subsampling procedure that removes this bias and in some cases dramatically improves efficiency.
Annals of Statistics 2014, Vol. 42, No. 5, 1693-1724.
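The accept/reject step of the procedure can be sketched in a few lines of Python (a schematic only, not the authors' code; `pilot_prob` stands for a hypothetical pilot estimate of P(y=1|x)):

```python
import random

def local_case_control_sample(data, pilot_prob, seed=0):
    """One pass of local case-control subsampling (the accept/reject
    step only).  Each point (x, y) is kept with probability
    |y - pilot_prob(x)|, so easy, correctly-classified points are
    rarely retained while surprising points are almost always kept."""
    rng = random.Random(seed)
    kept = []
    for x, y in data:
        p = pilot_prob(x)
        if rng.random() < abs(y - p):
            kept.append((x, y))
    return kept
```

The retained points are then refit with the pilot log-odds as an offset, which is the second stage that removes the bias of ordinary case-control sampling.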
- Lucas Janson, Will Fithian, and Trevor Hastie. Effective Degrees of Freedom: a Flawed Metaphor. The popular covariance formula for df gives some surprising results, like df>p in forward stepwise.
May 2015, Biometrika
- Stefan Wager, Trevor Hastie and Bradley Efron. Confidence Intervals for Random Forests: the Jackknife and the Infinitesimal Jackknife. We use ideas related to OOB errors to compute standard errors for bagging and random forests. Two approaches are presented, one based on the jackknife, the other on the infinitesimal jackknife. We study the bias of these estimates, as well as Monte Carlo errors.
(JMLR 2014, 15 1625-1651)
An R package is available for computing these estimates, currently residing on Stefan Wager's GitHub space; see the example.R file.
- David Warton, Bill Shipley and Trevor Hastie. CATS regression - a model-based approach to studying trait-based community assembly. We show how to use GLMs to fit community models, which are traditionally fit by maximum entropy. Apart from being a convenient platform for model fitting, all the usual summaries, statistics and extensions of GLMs become available. (Methods in Ecology and Evolution, September 29, 2014)
- Hristo Paskov, Robert West, John Mitchell and Trevor Hastie. Compressive Feature Learning. We use an unsupervised convex document compression algorithm to derive a sparse k-gram representation for a corpus of documents. This same dictionary, in the spirit of "deep learning", is as good as the original k-gram representation for document classification tasks. To appear, NIPS 2013.
- My first blog post with Will Fithian. This post refers to concerns that were raised about cross-validation.
- Noah Simon, Jerome Friedman and Trevor Hastie. A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression. We use the group lasso in the context of multinomial and multi-response regression. Each variable has multiple coefficients for the different responses, and they each get selected via a group lasso penalty. Our code is an efficient implementation of block coordinate descent, and is built into the glmnet package. (submitted, on arXiv).
- Michael Lim and Trevor Hastie. Learning interactions via hierarchical group-lasso regularization. We use the overlap group lasso in the context of a linear model to test for interactions. Our methodology can handle qualitative as well as quantitative variables. Our R package
glinternet can fit linear and logistic regression models. Optimized code can handle thousands of variables (our largest example had > 20K 3-level factors). arXiv:1308.2719 ( JCGS 2014, online access)
- Michael Jordan et al. Frontiers in Massive Data Analysis.
This 129-page document is the report produced by the Committee on the Analysis of Massive Data. The committee was established by the National Research Council of the National Academies, and met 4 times over 2011-2012 in Washington and California. Michael Jordan chaired the 18-member committee, made up of statisticians (5), computer scientists and mathematicians. I was a member of the committee, and was jointly responsible for Chapter 7 with David Madigan, although all committee members provided input to all chapters.
- Trevor Hastie and Will Fithian.
Inference from Presence-only Data: the Ongoing Controversy. This short paper argues strongly against the use of rigid parametric logistic regression models to make inferences from presence-only data in Ecology. Essentially, the rigidity of the model manufactures information that is not present in the data. Ecography (2013, editor's choice)
Video interview with David Warton at Ecostats conference at UNSW in Sydney in July 2013. David was wearing his editor's hat for
Methods in Ecology and Evolution, and the discussion centered on this paper.
- Noah Simon, Jerome Friedman, Trevor Hastie and Rob Tibshirani.
The Sparse Group Lasso
By mixing L1 penalties with group-lasso L2 penalties, we achieve a sparse group lasso where some members of a group can end up being zero.
JCGS, May 2013, 22(2), pages 231-245.
- Jianqiang Wang and Trevor Hastie.
Boosted Varying-Coefficient Regression Models for Product Demand Prediction.
We use the varying coefficient paradigm to fit a market segmented product demand model, with boosted regression trees as the nonparametric component. JCGS (online access) March 2013.
Julia Viladomat, Rahul Mazumder, Alex McInturff, Douglas McCauley and Trevor Hastie
Assessing the significance of global and local
correlations under spatial autocorrelation; a nonparametric approach.
Variables collected over a spatial domain often exhibit strong spatial autocorrelation.
When such variables are used in a regression, pairwise correlation analysis, or in the popular geographically weighted regression, it can be difficult to assess significance. We propose a general approach based on randomization followed by smoothing to restore the spatial correlation structure.
R code used in the paper. (Biometrics, Jan 2014)
- Will Fithian and Trevor Hastie. Finite-sample equivalence in statistical models for presence-only data. We show that many different approaches to presence-only data are equivalent, in particular inhomogeneous Poisson processes, maxent, and naive logistic regression (when weighted appropriately).
(AoAS 2013 7 (4) 1917-1939 ).
- Jason Lee and Trevor Hastie. Learning Mixed Graphical Models.
We use group-lasso regularized pseudo-likelihood for learning the structure of a graphical model with mixed discrete and continuous variables. Our model respects the symmetry imposed by a Markov random field representation --- each of the potentials gets a vote from a pair of regression models (Gaussian, logistic or multinomial), where each of the pair of variables is the response and predictor. (On arXiv, and to appear in JCGS).
Go to Jason Lee's webpage for matlab code and a demo.
Rahul Mazumder, Jerome Friedman and Trevor Hastie. Sparsenet R package on CRAN. Fits sparse solution paths for linear models (square-error loss) using coordinate descent with MC+ penalty family.
Software is very fast, and can handle many thousands of variables. Functions for cross-validation, prediction, plotting etc. Based on algorithms described in SparseNet
Coordinate Descent with Non-Convex Penalties. JASA 2011, 106(495) 1125-1138.
- Rahul Mazumder and Trevor Hastie
Exact Covariance Thresholding into Connected
Components for Large-Scale Graphical Lasso. A block screening rule for graphical models that can dramatically speed up computations, and allow for distributed computing of models on a much larger scale. Original copy on arXiv 8/18/2011; published JMLR March 2012, 13, 723-736.
- Rahul Mazumder and Trevor Hastie
The Graphical Lasso: New Insights and Alternatives (arXiv 11/23/2011, published November 2012). We examine the glasso algorithm for solving the graphical lasso problem. We show that it solves the dual problem,
where the optimization variable is the covariance rather than the precision matrix. We propose a similar primal algorithm, which appears to be superior in speed and other properties. Electronic Journal of Statistics (2012), 6, 2125-2149.
R package dpglasso
- Gen Nowak; Trevor Hastie; Jonathan R. Pollack; Robert Tibshirani
A fused lasso latent feature model for analyzing multi-sample aCGH data.
Biostatistics (advance access June 2011). A new approach for modeling multi-sample CGH data that exploits the similarities in copy-number variation at the same locations in the genome. Here is the
online link at the Biostatistics journal website.
The R package FLLat is available from CRAN.
- Noah Simon, Jerome Friedman, Trevor Hastie and Rob Tibshirani
Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent.
We develop the tools needed to include the Cox proportional hazards model in GLMNET.
Journal of Statistical Software 39(5), 1-13 (2011).
- Rob Tibshirani, Jacob Bien, Jerome Friedman, Trevor Hastie, Noah Simon, Jonathan Taylor and Ryan Tibshirani:
Strong rules for discarding predictors in lasso-type problems. We develop
rules for screening predictors for lasso and elastic-net penalized
models. When p>>n, this can result in large computational savings,
without any loss in accuracy. JRSS B (2012) 74
Jane Elith, Steven Phillips, Trevor Hastie, Miroslav Dudik, Yung En
Chee and Colin Yates. A
statistical explanation of Maxent for Ecologists.
Maxent is a method for modeling species prevalence with
presence-background data. This paper explains maxent using the language of
statistical models. Diversity and Distributions, November 2010.
Jerome Friedman, Trevor Hastie and Rob Tibshirani:
Applications of the lasso and grouped lasso to the estimation of sparse graphical models
We develop efficient algorithms for fitting sparse, undirected graphical models. These are specially designed for the "large p" situation.
Jerome Friedman, Trevor Hastie and Robert Tibshirani:
A Note on the Group Lasso and a Sparse Group Lasso.
We develop a group lasso with both sparsity of groups and sparsity within groups.
We also develop coordinate-wise algorithms for fitting both cases.
- Rahul Mazumder, Jerome Friedman and Trevor Hastie:
Coordinate Descent with Non-Convex Penalties. JASA 2011, 106(495) 1125-1138. Non-convex penalties
produce sparser models than the LASSO, but pose difficulties for
optimization. We propose a structured algorithm using coordinate
descent which finds good solutions with guaranteed convergence.
A version of the SparseNet paper with extra figures and some additional technical proofs is also available.
sparsenet R package available from CRAN (Feb 2012)
- Rahul Mazumder, Trevor Hastie and Rob Tibshirani:
Spectral Regularization Algorithms for Learning Large
Incomplete Matrices. We develop
an iterative algorithm for matrix completion using nuclear-norm regularization. JMLR 2010 11 2287-2322
MATLAB package SoftImpute for matrix completion (zip archive).
R package to appear soon.
Daniela Witten, Rob Tibshirani and Trevor Hastie:
A penalized matrix
decomposition, with applications to sparse canonical correlation analysis and
principal components. Biostatistics 10(3).
Trevor Hastie, Robert Tibshirani and Jerome Friedman, Elements
of Statistical Learning: Data Mining, Inference and
Prediction (Second Edition). February, 2009. 745 pages in full color. Springer-Verlag, New York.
This second edition adds 4 new chapters: Random Forests, Ensemble Learning, Undirected Graphical Models, and High Dimensional Problems: p>>N. For more details see
ESL book homepage.
In an agreement with Springer, we are able to offer for free the
ESL book pdf (8.2M).
- Tong Tong Wu, Yi Fang Chen, Trevor Hastie, Eric Sobel and Kenneth Lange.
Genome-wide Association Analysis by Lasso Penalized Logistic Regression.
We develop efficient computational procedures for screening
large-scale genome-wide association studies.
Bioinformatics 25(6): 714-721, 2009.
Ping Li, Ken Church and Trevor Hastie:
One sketch for all: theory and application of conditional
random sampling. NIPS 2008 proceedings.
Jerome Friedman, Trevor Hastie and Robert Tibshirani:
Regularization Paths for Generalized Linear Models via Coordinate Descent.
We use coordinate descent to develop regularization paths for linear,
logistic and multinomial regression models. Our algorithms use the
"elastic net" penalties of Zou and Hastie (2005), and create the
path for a grid of values of the penalty parameter lambda.
Journal of Statistical Software, 33(1), 2010
The R package glmnet is available from CRAN.
A matlab wrapper for the glmnet fortran code, written by Hui Jiang.
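The coordinate-descent update at the heart of this approach can be sketched as follows (a minimal pure-Python illustration of the elastic-net update, not the glmnet Fortran code; it assumes the columns of X are standardized to mean 0 and unit variance):

```python
def soft_threshold(z, g):
    """S(z, g) = sign(z) * max(|z| - g, 0)."""
    if z > g:  return z - g
    if z < -g: return z + g
    return 0.0

def enet_coordinate_descent(X, y, lam, alpha=1.0, n_iter=100):
    """Cyclic coordinate descent for the elastic net (sketch).
    alpha=1 gives the lasso, alpha=0 pure ridge."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    r = list(y)                       # residual r = y - X beta
    for _ in range(n_iter):
        for j in range(p):
            # inner product of coordinate j with the partial residual
            z = sum(X[i][j] * r[i] for i in range(n)) / n + beta[j]
            b_new = soft_threshold(z, lam * alpha) / (1.0 + lam * (1 - alpha))
            if b_new != beta[j]:
                delta = b_new - beta[j]
                for i in range(n):    # keep the residual in sync
                    r[i] -= X[i][j] * delta
                beta[j] = b_new
    return beta
```

In glmnet this update is wrapped in warm starts over a decreasing grid of lambda values, which is what produces the full regularization path so cheaply.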
Jane Elith, John Leathwick and Trevor Hastie:
A working guide to boosted regression trees (2008). Journal of
Animal Ecology, 77, 802-813. Here are the
online supplement materials.
This paper received the award for most highly cited paper in any of the British Ecological Society journals in the past 5 years.
Line Clemmensen, Trevor Hastie, Daniela Witten and Bjarne Ersboll:
Sparse Discriminant Analysis.
We extend penalized linear and mixture discriminant analysis by
incorporating a lasso penalty to encourage sparseness. Technometrics (2011)
John Leathwick, Jane Elith, W. Chadderton, D. Rowe and Trevor Hastie:
Dispersal, disturbance and the contrasting biogeographies of New
Zealand's diadromous and non-diadromous fish species.
An application of boosted regression trees in ecological mapping.
J. Biogeography 2008, 35 1481-1497.
Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani:
"Preconditioning" for feature selection and regression in high-dimensional problems.
We show that supervised principal components followed by a variable-selection
procedure is an effective approach for variable selection in very high dimensions.
Annals of Statistics 36(4), 2008, 1595-1618.
Jerome Friedman, Trevor Hastie and Robert Tibshirani,
Sparse inverse covariance estimation with the lasso.
We develop an efficient algorithm for solving the L1-penalized
likelihood approach to sparse covariance estimation.
- Trevor Hastie
Comment on a paper in Statistical Science by Peter Bühlmann and Torsten Hothorn: Boosting
Algorithms: Regularization, Prediction and Model Fitting (2007) 22(4), 477-522.
- Jerome Friedman, Trevor Hastie and Robert Tibshirani,
Discussion of "Evidence contrary to the statistical view of boosting"
(David Mease and Aaron Wyner). Mease and Wyner show through
examples some counter-intuitive results with boosting that appear to
contradict our 2000 paper. We discount these claims by reversing their
results using shrinkage along with boosting. JMLR 9 (2008), 59-64.
- Jerome Friedman, Trevor Hastie, Holger Hoefling and Robert Tibshirani,
Pathwise Coordinate Optimization. We show how coordinate descent
algorithms can efficiently solve a number of popular regularized
optimization problems, creating an entire path of solutions. We
generalize this approach to derive an efficient algorithm for the
fused lasso, both one- and two-dimensional. Annals of Applied
Statistics (2007), 1(2), 302-332.
- Ping Li, Trevor Hastie and Kenneth Church.
Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1
using Cauchy Random Projections. We provide improved methods for approximating L1 distances in very
high dimensions, based on maximum-likelihood estimation in the Cauchy
family. JMLR 8, pp 2497-2532.
- Ping Li and Trevor Hastie.
A Unified Near-Optimal Estimator for Dimension Reduction in L_a
(0< a <= 2) Using Stable Random Projections. NIPS2007 poster
- Brad Efron, Trevor Hastie and Rob Tibshirani,
Discussion of the "Dantzig Selector" by Emmanuel Candes and Terrence Tao.
Candes and Tao propose an alternative but similar procedure to the
lasso. This discussion appears alongside the original article in the
Annals of Statistics 35(6).
- Trevor Hastie, Jonathan Taylor, Robert Tibshirani and Guenther Walther,
Forward Stagewise Regression and the Monotone Lasso.
We characterize the incremental forward stagewise procedure as a
monotone version of the lasso. Electronic Journal of Statistics
- Trevor Hastie and Ji Zhu,
Discussion of "Support Vector Machines with Applications" by
Javier M. Moguerza and Alberto Munoz, Statistical Science 21(3)
- Gill Ward, Trevor Hastie, Simon Barry, Jane Elith and John Leathwick,
Presence-only data and the EM algorithm. We develop a method for
fitting the two-class logistic regression model using labeled data from
one class, a sample of unlabeled data, and knowledge of the
class prevalences. Biometrics, 65(2), 554-563, 2009.
Download the beta version of Gill Ward's R package ecogbm -
ecogbm_1.01.tar.gz - for fitting boosted regression models for presence-only data. See the
student section of my homepage for Gill Ward's thesis.
A presentation by Gill Ward based on this work "Making the Best Use of
Available Data: The Presence-Only Problem in Ecology" won an
honorable mention award at the 2007 Joint Statistical Meetings.
- John Leathwick, Jane Elith and Trevor Hastie, Comparative performance of generalized
additive models and multivariate adaptive regression splines for
statistical modelling of species distributions. (2006) Ecological
Modelling 199 188-196. This is a special issue of the
journal devoted to the workshop on
Advances in Predictive Species
Distribution Models held in Riederalp,Switzerland, 2004.
- Ping Li, Trevor Hastie and Kenneth Church,
Sparse Random Projections. A method for approximating pairwise distances in
very high-dimensional spaces. Best student paper, KDD-06.
- Mee-Young Park and Trevor Hastie,
Regularization Path Algorithms for Detecting Gene Interactions.
We develop a path algorithm for fitting the "cosso" models of Yuan &
Lin (2006) with logistic regression. This allows factors and interactions
to enter the model in a smooth way.
- Mee-Young Park and Trevor Hastie,
Penalized Logistic Regression for Detecting Gene Interactions.
A modified version of forward-stepwise logistic regression suitable
for screening large numbers of gene-gene interactions.
R package for fitting PLR models.
- Mee-Young Park, Trevor Hastie and Rob Tibshirani, Averaged gene expressions for regression
A regression method that combines the lasso with hierarchical
clustering, intended for selecting groups of genes in microarray studies.
Biostatistics (2007, 8, 212-227).
An R package is available for fitting these models.
Yaqian Guo, Trevor Hastie and Robert Tibshirani
Regularized Discriminant Analysis and its Application in Microarrays.
A method, similar to shrunken centroids, for classification and
discrimination of microarrays, using regularized discriminant analysis
with gene selection. Biostatistics (in press; epub)
- Mee-Young Park and Trevor Hastie, An L1 Regularization-path Algorithm for
Generalized Linear Models.
A generalization of the LARS algorithm for GLMs and the Cox
proportional hazard model. Since the coefficient
paths are piecewise-nonlinear, approximations are made using the
predictor-corrector algorithm of convex optimization.
glmpath: R software package for fitting L1 regularized GLMs and
Cox models. (JRSSB 2007 (69, part 4), pages 659-677 )
Ping Li, Trevor Hastie and Kenneth Church,
Improving Random Projections Using Marginal Information.
Methods for speeding up document search and characterization. Accepted
at COLT 2006.
This paper draws on results in the following two technical reports.
Rob Tibshirani and Trevor Hastie, Margin
Trees for High-dimensional Classification.
A tree-structured representation for a multiclass SVM classifier.
- Hui Zou, Ji Zhu and Trevor Hastie,
New Multicategory Boosting Algorithms
Based on Multicategory Fisher-Consistent Losses.
We provide some general requirements for multiclass margin-based
classifiers. Annals of Applied Statistics 2(4), pp 1290-1306, 2008.
- Ji Zhu, Hui Zou, Saharon Rosset and Trevor Hastie,
Multi-class AdaBoost. A multi-class generalization of the Adaboost algorithm, based on a
generalization of the exponential loss.
published in 2009 in
Statistics and Its Interface Volume 2 (2009) 349-360.
Code on Ji Zhu's website
- J. Leathwick, J. Elith, M. Francis, T. Hastie, P. Taylor.
Variation in demersal fish species richness in the oceans surrounding
New Zealand: an analysis using boosted regression trees.
(Marine Ecology Progress Series, published in 2006).
A detailed analysis of species abundance using Poisson regression
with boosted regression trees. All analysis done using the gbm
package in R (Greg Ridgeway).
- J. Leathwick, D. Rowe, J. Richardson, J. Elith and T. Hastie,
Using multivariate adaptive regression splines to predict
the distributions of New Zealand's freshwater diadromous fish.
Freshwater Biology 50, 2034-2051.
Presence-absence species data are modelled using MARS together with GLMs.
- Mee-Young Park and Trevor Hastie,
Hierarchical Classification using Shrunken Centroids.
A technique for classification when the number of classes is large. It
produces a hierarchically structured classification rule, with the
hardest-to-separate classes at the terminal nodes.
Hui Zou, Trevor Hastie, and Rob Tibshirani,
On the "Degrees of Freedom" of the Lasso.
A technical paper that establishes that the number of non-zero
coefficients in a lasso model is unbiased for the effective degrees
of freedom. Published in Annals of Statistics (2007),
35, 5, 2173-2192.
Philip Beineke, Trevor Hastie and Shivakumar Vaithyanathan,
Sentimental Factor: Improving Review Classification via Human-Provided
Information. Proceedings ACL 2004, Barcelona. (ACL: Association for Computational Linguistics)
Eric Bair, Trevor Hastie, Debashis Paul, and Robert Tibshirani
Prediction by Supervised Principal Components. Published in
JASA (2006), Vol. 101, No. 473, pp 119-137.
NIPS2004 - The following papers were accepted for NIPS 2004:
Hui Zou, Trevor Hastie, and Rob Tibshirani.
Sparse Principal Component Analysis. We present a new approach to
principal component analysis, that allows us to use an L1 penalty to
ensure sparseness of the loadings. Published in JCGS 2006 15(2):
262-286. Software is available in R package elasticnet
available from CRAN.
Trevor Hastie, Saharon Rosset, Rob Tibshirani and Ji Zhu.
The Entire Regularization Path for the Support Vector Machine.
JMLR, 5(Oct) 1391-1415.
An algorithm for computing the two-class SVM solution for all possible
values of the regularization parameter C, at essentially the cost
of a single SVM fit. Not only does this allow for efficient model
selection, but it also exposes the role of regularization for SVMs.
MPEG movies show the sequence of solutions for different examples.
The svmpath software package for R.
Trevor Hastie and Robert Tibshirani.
Efficient Quadratic Regularization for Expression
Arrays. Biostatistics (2004), 5(3), pp 329-340. Computational tricks for a large class of linear
models fit by quadratic regularization.
Hui Zou and Trevor Hastie.
Regularization and Variable Selection via the Elastic Net (pdf). JRSSB (2005)
67(2) 301-320. A compromise between ridge regression and the lasso,
with the computational advantages of the lasso. The elastic net
selects variables in correlated sets. An R package elasticnet
is available from CRAN.
Featured on Essential Science Indicators.
Jerome Friedman, Trevor
Hastie, Saharon Rosset, Rob Tibshirani and Ji Zhu.
Discussion of three Boosting papers. Annals of
Statistics, 2004, vol 32 (1), pp 102-107. The three papers are by
(1) Wenxin Jiang, (2) Gabor Lugosi and Nicolas Vayatis, and (3) Tong Zhang.
Saharon Rosset, Ji Zhu and Trevor Hastie.
Margin Maximizing Loss Functions
(accepted poster Nips 2003)
Ji Zhu, Saharon Rosset, Trevor Hastie and Rob Tibshirani.
1-Norm Support Vector Machines
(accepted spotlight poster Nips 2003)
- Mu Zhu, Trevor Hastie and Guenther Walther.
Constrained Ordination Analysis with Flexible Response
Functions. Constrained ordination via nonparametric
discriminant analysis. Ecological Modelling (2005).
Francesca Dominici, Aidan McDermott and Trevor Hastie.
Improved Semiparametric Time Series Models of Air Pollution
and Mortality [pdf Technical Report] JASA, December 2004, 99(468), 938-948.
- Ji Zhu and Trevor Hastie
Classification of Gene Microarrays by Penalized Logistic
Regression. Biostatistics 5(3):427-443.
Code (from Ji Zhu's website)
Trevor Hastie and Robert Tibshirani.
Independent Component Analysis through
Product Density Estimation (ps file). A direct statistical approach to
ICA, using an attractive spline representation to model each of
the marginal densities. A more recent (Nov 2002) version is also available.
Saharon Rosset, Ji Zhu and Trevor Hastie.
Boosting as a Regularized Path to a
Maximum Margin Classifier (pdf file). JMLR 5 (Aug 2004), 941-973. We show that a
version of boosting fits a model by optimizing a
L1-penalized loss function. This in turn shows that the
corresponding versions of Adaboost and Logitboost converge
to an "L1" optimal separating hyperplane.
- Support Vector Machines, Kernel Logistic
Regression and Boosting . Slides for talk given at Spring
research conference in Michigan, MCS2002 in Sardinia,
NPCONF2002 in Crete, and ASA2002 in New York.
Hongjuan Zhao, Trevor Hastie, Michael L. Whitfield, Anne-Lise
Borresen-Dale and Stefanie S. Jeffrey.
Optimization and evaluation of T7 based RNA linear amplification
protocols for cDNA microarray analysis.
BMC Genomics 2002, 3:31 (30 Oct 2002).
Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu.
Class prediction by
nearest shrunken centroids, with applications
to DNA microarrays
This is a more statistical version of the PNAS paper below.
Rob Tibshirani, Trevor Hastie, Balasubramanian Narasimhan and Gilbert Chu:
"Diagnosis of multiple cancer types by shrunken centroids of gene
expression" (PNAS website). PNAS 2002 99:6567-6572 (May
14). See PAM
website for software (available soon).
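The shrinkage step of the method can be sketched as follows (an illustration of the idea only; it omits the per-gene standard errors and class-size factors the actual PAM procedure uses):

```python
def soft(z, d):
    """Soft-threshold: shrink z toward 0 by d."""
    return (z - d) if z > d else (z + d) if z < -d else 0.0

def shrunken_centroids(X, labels, delta):
    """Pull each class centroid toward the overall centroid by
    soft-thresholding, so a gene whose class means all collapse to
    the overall mean contributes nothing to classification."""
    p = len(X[0])
    overall = [sum(row[j] for row in X) / len(X) for j in range(p)]
    shrunk = {}
    for k in sorted(set(labels)):
        rows = [row for row, lab in zip(X, labels) if lab == k]
        cent = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        shrunk[k] = [overall[j] + soft(cent[j] - overall[j], delta)
                     for j in range(p)]
    return shrunk
```

Samples are then classified to the nearest shrunken centroid; increasing delta drops more genes from the rule.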
Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
Least Angle Regression
Annals of Statistics (with discussion) (2004) 32(2), 407-499. A new method for variable subset selection, with the lasso and "epsilon" forward stagewise methods as special cases.
LARS Software for R and Splus.
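The "epsilon" forward stagewise special case mentioned above is simple enough to sketch directly (an illustrative pure-Python version assuming standardized predictors, not the LARS package code):

```python
def forward_stagewise(X, y, eps=0.01, steps=1000):
    """Incremental forward stagewise: repeatedly nudge, by eps, the
    coefficient of the predictor most correlated with the current
    residual.  As eps -> 0 the coefficient profiles trace out a
    lasso-like path."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    r = list(y)
    for _ in range(steps):
        # inner product of each predictor with the residual
        c = [sum(X[i][j] * r[i] for i in range(n)) for j in range(p)]
        j = max(range(p), key=lambda k: abs(c[k]))
        if abs(c[j]) < 1e-12:
            break                      # residual is uncorrelated with all predictors
        delta = eps if c[j] > 0 else -eps
        for i in range(n):
            r[i] -= X[i][j] * delta
        beta[j] += delta
    return beta
```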
Antoine Guisan, Thomas Edwards and Trevor Hastie
Generalized linear and generalized additive models in studies of species
distributions: setting the scene. Ecological Modelling
(2002) 157, 89-100.
Trevor Hastie, Robert Tibshirani and Jerome Friedman,
The Elements of Statistical Learning: Data Mining, Inference and Prediction.
Springer-Verlag, New York.
Mu Zhu and Trevor Hastie,
"Feature extraction for
non-parametric discriminant analysis". JCGS (2003), 12(1), pages 101-120.
Ji Zhu and Trevor Hastie,
"Kernel Logistic Regression and the Import Vector Machine", (NIPS, 2001; JCGS 2005). Copy of
slides(pdf) presented by TH in
Kyoto in December, 2001.
R code for fitting IVM models.
Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan,
Michael Eisen, Gavin Sherlock, Pat Brown, and David Botstein
Exploratory screening of genes and
clusters from microarray experiments (ps file) or
Perou, C., Robert Tibshirani, Turid Aas, Stephanie Geisler, Hilde Johnsen,
Trevor Hastie, Michael B. Eisen, Matt van de Rijn, Stefanie S. Jeffrey,
Thor Thorsen, Hanne Quist, John C. Matese, Patrick O. Brown,
David Botstein, Per Eystein Lonning, and Anne-Lise Borresen-Dale.
Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical
implications. PNAS 98: 10869-10874.
Trevor Hastie, Robert Tibshirani, David Botstein and Pat Brown,
"Supervised Harvesting of Expression Trees" (postscript) .
Starting from a hierarchically clustered expression array, we build a
predictive model for an outcome variable using cluster nodes as inputs.
Tech. report. August 2000.
Olga Troyanskaya, Michael Cantor, Gavin Sherlock,
Pat Brown, Trevor Hastie, Robert Tibshirani, David Botstein
and Russ B. Altman,
Missing value estimation methods for DNA microarrays.
Bioinformatics, Vol. 17 no. 6, 2001.
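The nearest-neighbour flavour of imputation studied in this line of work can be sketched as follows (a simplified illustration, not the published implementation; missing entries are represented as None, and distances use only coordinates both rows observe):

```python
def knn_impute(X, k=2):
    """Fill each missing entry of a row with the average of that
    column over the k rows closest in Euclidean distance on the
    commonly observed coordinates."""
    def dist(a, b):
        s, m = 0.0, 0
        for u, v in zip(a, b):
            if u is not None and v is not None:
                s += (u - v) ** 2
                m += 1
        return (s / m) ** 0.5 if m else float("inf")

    filled = [row[:] for row in X]
    for i, row in enumerate(X):
        for j, v in enumerate(row):
            if v is None:
                # rows that observe column j, ranked by distance to row i
                donors = sorted(
                    (dist(row, other), other[j])
                    for t, other in enumerate(X)
                    if t != i and other[j] is not None
                )[:k]
                if donors:
                    filled[i][j] = sum(val for _, val in donors) / len(donors)
    return filled
```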
Eva Cantoni and Trevor Hastie "Degrees-of-Freedom Tests for Smoothing
Splines." Tech Report, May 2000.
A mixed-effects framework for smoothing splines and additive models
allows for exact tests between nested models of different complexity.
The complexity is calibrated via the effective degrees of freedom.
Thomas Yee and Trevor Hastie.
Reduced Rank Vector Generalized Linear
Models (2003) Statistical Modelling, 3, pages 15-41. Using the multinomial
as a primary example, we propose reduced rank logit models for
discrimination and classification. This is a conditional version
of the reduced rank model of linear discriminant analysis.
Robert Tibshirani, Guenther Walther and Trevor Hastie.
"Estimating the number of clusters in a dataset via the Gap statistic".
Journal of the Royal Statistical Society, B, 63:411-423, 2001.
Modeling and Tracking of Human Motion, a joint project with
Dirk Ormoneit and Michael Black's group at Xerox PARC, with
motion-graphics demonstrations of learned walking motions.
Page 50 of "Generalized Additive Models" by Hastie and Tibshirani,
1990, Chapman and Hall. Some copies of the 1999 printing by CRC Press
replaced page 50 with a page from a history text!
page50.ps or page50.pdf
Trevor Hastie, Laura Bachrach, Balasubramanian Narasimhan and May Choo
Wang. Flexible Statistical Models for Growth Fragments: a Study of
Bone Mineral Acquisition. Compare your own measurements using our interactive page.
Gareth James and Trevor Hastie Functional Linear Discriminant
Analysis for Irregularly Sampled Curves (2001) Journal of the Royal
Statistical Society, Series B JRSS B 63, 533-550.
Trevor Hastie, Robert Tibshirani, Michael B Eisen, Ash
Alizadeh, Ronald Levy, Louis Staudt, Wing C Chan, David Botstein and Pat Brown.
`Gene shaving' as a method for identifying distinct sets of genes
with similar expression patterns. This is an online version
of the paper, published in the
online journal GenomeBiology.
Trevor Hastie, Robert Tibshirani, Michael Eisen, Pat Brown, Doug Ross, Uwe Scherf, John Weinstein, Ash Alizadeh, Louis Staudt, David Botstein
"Gene Shaving: a New Class of Clustering Methods for Expression Arrays".
Postscript (2.9mb) or
Adobe pdf (5.4mb) Tech. report. Jan 2000.
Hastie, T., and Sugar, C.
"Principal Component Models for Sparse Functional Data".
(2000, Biometrika, 87, 587-602) (pdf). When the data are collections of sampled curves or
images, functional principal components produce the principal
modes of variation. Here we generalize these
procedures to deal with the case when each curve is sparsely and irregularly sampled.
Hastie, T., Tibshirani, R., Sherlock, G., Eisen, M., Brown, P.
and Botstein, D. "Imputing Missing
Data for Gene Expression Arrays". Technical report (1999),
Stanford Statistics Department.
pdf (145Kb)
Tibshirani, R., Hastie, T. Eisen, M., Ross, D. , Botstein, D.
and Brown, P. "Clustering methods for the analysis of DNA microarray data".
Postscript (4.8mb) or
Compressed Postscript (1.8mb)
Tech. report Oct. 1999.
- D. Ormoneit and T. Hastie.
Optimal kernel shapes for local linear regression.
In S. A. Solla, T. K. Leen, and K-R. Müller, editors, Advances
in Neural Information Processing Systems 12. The MIT Press, 2000.
Tibshirani, R. and Lazzeroni, L. and
Hastie, T. and Olshen, A. and Cox, D.R.
"A Global Pairwise Approach to Radiation Hybrid Mapping".
Technical Report January 1999. Using data of co-occurrence of
hybridized markers after shattering, inference is made of the marker
sequence in the chromosome.
Friedman, J., Hastie, T. and Tibshirani, R. (Published version)
Additive Logistic Regression:
a Statistical View of Boosting Annals of
Statistics 28(2), 337-407 (with discussion).
We show that boosting fits an additive logistic regression model
by stagewise optimization of a criterion very similar to the
log-likelihood, and present likelihood based alternatives. We
also propose a multi-logit boosting procedure which appears to have
advantages over other methods proposed so far.
Here are the slides (2 per page) for my talk on this paper.
Crellin, N., Hastie, T. and Johnstone, I.
"Statistical Models for Image
Sequences" Technical report, submitted to "Human Brain Mapping".
We study fMRI sequences of the human brain obtained from
experiments involving repetitive neuronal activity. We investigate the
functional form of
the hemodynamic response function, and provide evidence that
the commonly adopted convolution model is inadequate.
Hastie, T. and Tibshirani, R.
"Bayesian Backfitting" Stanford Technical report.
The Gibbs sampler looks and feels like the backfitting algorithm
for fitting additive models. Indeed, a simple modification to
backfitting turns it into a Gibbs sampler for spitting out
samples from the "posterior" distribution for an additive fit.
Published Statistical Science 15, no. 3 (2000), 196-223
Wu, T., Hastie, T., Schmidler, S. and Brutlag, D.
"Regression Analysis of Multiple
Protein Structures" Models for lining up and averaging
groups of protein structures.
Rubinstein, D. and Hastie, T. "Discriminative vs Informative Learning" A comparison of two
frequently used but different paradigms for training classifiers.
Maes, S. and Hastie, T.
"Dynamic Mixtures of Splines: a Model for
Saliency Grouping in the Time Frequency Plane"
This is an application of mixture modeling to speech data. We
use a moving mixture of Gaussians to represent the
formant-frequencies in speech data.
Hastie, T., and Tibshirani, R. and Buja, A.
"Flexible Discriminant and Mixture Models"
In edited proceedings of "Neural Networks and
Statistics" conference, Edinburgh, 1995. J. Kay and
D. Titterington, Eds. Oxford University Press.
Wu, T., Schmidler, S., Hastie, T., and Brutlag, D.
"Modelling and superposition of
structures using affine transformations: analysis of the
James, G., and Hastie, T.
"Generalizations of the Bias/Variance
Decomposition for Prediction Error".
Several papers have recently appeared on this topic, and each
have a different viewpoint and decomposition. We hope ours does
not add to the confusion.
James, G., and Hastie, T.
Error Coding and PaCTs.
Gareth James' winning paper in the ASA student paper competition
for the Statistical Computing Section. This is one of four winning papers.