Statistical Learning and Data Mining
I have a long-standing interest in flexible and nonparametric
techniques for function estimation and prediction. My Ph.D. thesis was
on "Principal Curves and Surfaces" (advisor Werner Stuetzle), a
nonparametric version of principal components that fits smooth curves
and low-dimensional manifolds through the "middle" of a
multi-dimensional set of points.
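The principal-curve idea can be sketched roughly as follows. This is a simplified illustration only, not the thesis algorithm itself: it initializes with the first principal component, then alternates between smoothing each coordinate against the current ordering of the points and re-parameterizing by arc length, using a plain moving-average smoother in place of a proper scatterplot smoother.

```python
import numpy as np

def moving_average(y, window=11):
    """Smooth an ordered sequence y with a centered moving average
    (window should be odd so the output has the same length)."""
    pad = window // 2
    ypad = np.concatenate([np.repeat(y[0], pad), y, np.repeat(y[-1], pad)])
    return np.convolve(ypad, np.ones(window) / window, mode="valid")

def principal_curve(X, n_iter=10):
    """Fit a smooth curve through the 'middle' of the points X (n x d).

    Simplified iteration: (1) order the points by their current
    projection parameter, (2) smooth each coordinate as a function of
    that ordering, (3) recompute the parameter as cumulative arc length
    along the smoothed curve."""
    Xc = X - X.mean(axis=0)
    # initialize the parameter with the first principal component score
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam = Xc @ Vt[0]
    curve = np.outer(lam, Vt[0]) + X.mean(axis=0)
    for _ in range(n_iter):
        order = np.argsort(lam)
        # smooth each coordinate against the current ordering
        smoothed = np.empty_like(curve)
        smoothed[order] = np.column_stack(
            [moving_average(X[order, j]) for j in range(X.shape[1])])
        curve = smoothed
        # re-parameterize by cumulative arc length along the curve
        seg = np.linalg.norm(np.diff(curve[order], axis=0), axis=1)
        arc = np.concatenate([[0.0], np.cumsum(seg)])
        lam_new = np.empty_like(lam)
        lam_new[order] = arc
        lam = lam_new
    return curve, lam
```

A fuller implementation would project each point to its nearest point on the curve rather than reusing the previous ordering, which is the main simplification made here.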
"Generalized Additive Models" (with Rob Tibshirani) offer a more
flexible approach to popular methods like multiple linear regression,
logistic and log-linear regression, and the Cox model. Linear functions
can be replaced by more flexible smooth functions. One can mix and
match linear terms with smooth terms, which allows a natural blend
with classical linear models.
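The mix-and-match idea behind additive models can be sketched with the backfitting algorithm: cycle through the predictors, and fit each component function to the partial residuals of the others. This is a minimal illustration assuming a crude moving-average scatterplot smoother, not a production GAM fitter:

```python
import numpy as np

def smooth(x, r, window=15):
    """Scatterplot smoother: order the pairs by x and take a centered
    moving average of the responses r (window should be odd)."""
    order = np.argsort(x)
    pad = window // 2
    r_ord = r[order]
    rpad = np.concatenate([np.repeat(r_ord[0], pad), r_ord,
                           np.repeat(r_ord[-1], pad)])
    fhat = np.convolve(rpad, np.ones(window) / window, mode="valid")
    out = np.empty_like(r)
    out[order] = fhat
    return out

def backfit(X, y, n_iter=20):
    """Fit y ~ alpha + f1(x1) + ... + fp(xp) by backfitting:
    repeatedly smooth each predictor against the partial residuals
    obtained by removing all the other fitted components."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = smooth(X[:, j], partial)
            f[:, j] -= f[:, j].mean()   # center each component function
    return alpha, f
```

Replacing the smoother for a given term with a least-squares line recovers a linear term, which is exactly the mixing of linear and smooth components described above.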
In the same vein, I have worked on nonparametric versions of linear
discriminant analysis, mixture classification problems, and other
related methods.
Other exotica include modeling human signatures, handwritten digits,
three-dimensional protein structures, and human gait, and I am still
looking...
In the last ten years my colleagues and I have been drawn into the
machine learning domain, probably lured by neural networks.
This has led us to offer a statistical perspective on novel and
popular techniques arising outside of statistics, such as boosting and
support-vector machines. This culminated in our 2001 book "The Elements
of Statistical Learning", but the interest continues.