Alexis Battle
ajbattle at cs.stanford.edu
Gates 130
Stanford University
school:
Alexis has moved to Johns Hopkins! We are looking for comp bio students and postdocs to join the Battle Lab.
research interests:
My research focuses on the genetics of gene regulation and complex traits. We use machine learning and probabilistic models to untangle the effects of genetic variation on clinically relevant phenotypes. Working with a range of genomic data, including RNA-sequencing, quantitative genetic interactions, and genome-wide association studies, I am interested in constructing biologically informed models. In particular, I leverage pathways and gene networks to identify trait-relevant genetic factors and to develop biologically interpretable models. My computational interests include graphical models, transfer learning, and structured regularization methods.
projects:
Modeling the effect of genetic variation on gene expression:
My most recent work focuses on characterizing the landscape of genetic variants that affect gene expression, as these regulatory variants are believed to play an important role in common diseases. We have collected RNA-sequencing data for nearly one thousand European subjects. Combining these data with genotyping of each individual, we have identified thousands of novel associations between genetic variation and diverse aspects of gene expression. From this extensive catalog of associations, we have trained Bayesian latent variable models, incorporating features based on genomic annotations, to characterize and predict the consequences of regulatory variants. (Battle, Genome Research 2013)
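As a rough illustration of what a single association between a genetic variant and gene expression looks like, the sketch below regresses simulated expression values on genotype dosage and assesses the slope with a permutation test. This is not the pipeline used in the published work: the data are simulated, and the function names (eqtl_slope, permutation_pvalue), sample size, and effect size are all assumptions chosen for illustration.

```python
import random

def eqtl_slope(genotypes, expression):
    """Least-squares slope of expression on genotype dosage (0, 1, or 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    var = sum((g - mean_g) ** 2 for g in genotypes)
    return cov / var

def permutation_pvalue(genotypes, expression, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the slope (expression labels shuffled)."""
    rng = random.Random(seed)
    observed = abs(eqtl_slope(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(eqtl_slope(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)

# Simulated toy data (hypothetical): 200 individuals, true effect 0.8 per allele.
rng = random.Random(1)
genotypes = [rng.choice([0, 1, 2]) for _ in range(200)]
expression = [0.8 * g + rng.gauss(0, 1) for g in genotypes]

slope = eqtl_slope(genotypes, expression)
pval = permutation_pvalue(genotypes, expression)
print(f"estimated effect per allele: {slope:.2f}, permutation p-value: {pval:.4f}")
```

A genome-wide analysis repeats a test of this kind for a very large number of variant-gene pairs, which is why careful modeling and multiple-testing control matter.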
Active learning for discovery of interactions in complex human traits:
Interactions, where particular combinations of genetic variants result in non-additive effects on a trait, may also play a major role in human disease. Unfortunately, many genome-wide studies are underpowered to identify any significant interactions. However, our analysis of interaction patterns in three human disease traits suggests that non-additive effects are not distributed at random but rather follow predictable and biologically meaningful patterns, including enrichment between genes with known relationships. I have worked on an approach that leverages these patterns in a novel active learning method, Guided Adaptive Interaction Testing (GAIT), which automatically prioritizes candidate interactions and dramatically reduces the statistical burden of multiple hypothesis testing. GAIT identifies a large number of interactions in a variety of disease datasets, significantly improving our understanding of the mechanisms that drive them. (under review)
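To make the multiple-testing burden concrete, the short calculation below compares the Bonferroni significance threshold for an exhaustive pairwise scan with the threshold for a prioritized subset of candidate pairs. It does not implement GAIT; the variant count and the size of the prioritized subset are hypothetical numbers chosen only to show the scale of the difference.

```python
from math import comb

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

n_variants = 500_000          # hypothetical number of genotyped variants
all_pairs = comb(n_variants, 2)
prioritized_pairs = 10_000    # hypothetical number of prioritized candidates

exhaustive = bonferroni_threshold(all_pairs)
focused = bonferroni_threshold(prioritized_pairs)

print(f"pairs in an exhaustive scan: {all_pairs:,}")
print(f"exhaustive threshold: {exhaustive:.1e}")
print(f"prioritized threshold: {focused:.1e}")
```

Testing fewer, better-chosen hypotheses relaxes the threshold by several orders of magnitude, which is the motivation for prioritizing candidate interactions before testing them.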
A network-based framework for identifying disease risk-variants:
Disease variants with small effects on risk are often buried among many spurious associations in genome-wide association studies (population-level studies of genetic variation in disease). However, a key observation is that multiple co-functional or pathway-connected genes often affect the same trait. We leverage this observation to improve our power to detect disease variants. We developed a flexible regression-based framework, PriorNet, which incorporates a Markov Random Field prior on gene relevance, constructed from diverse sources of gene network and pathway information. This approach results in improved identification of disease-relevant genes, particularly those with small effect sizes. (presented at MLCB 2012)
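The intuition behind a Markov Random Field prior on gene relevance can be sketched with a toy pairwise model: assignments in which network neighbors share the same relevance label receive a higher prior score. This is a minimal sketch rather than the PriorNet formulation; the gene names, the edge list, and the coupling and sparsity parameters are all assumptions made for illustration.

```python
def mrf_log_prior(relevance, edges, coupling=1.0, sparsity=0.5):
    """Unnormalized log-prior of a binary relevance assignment.

    Each network edge whose two genes share a label adds `coupling`;
    each gene marked relevant costs `sparsity`, favoring sparse solutions.
    """
    agreement = sum(coupling for a, b in edges if relevance[a] == relevance[b])
    penalty = sparsity * sum(relevance.values())
    return agreement - penalty

# Hypothetical gene network: G1-G2-G3 form one pathway, G4-G5 another.
edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G5")]

# Two relevant genes that are network neighbors.
clustered = {"G1": 1, "G2": 1, "G3": 0, "G4": 0, "G5": 0}
# Two relevant genes in unrelated parts of the network.
scattered = {"G1": 1, "G2": 0, "G3": 0, "G4": 1, "G5": 0}

clustered_score = mrf_log_prior(clustered, edges)
scattered_score = mrf_log_prior(scattered, edges)
print(clustered_score, scattered_score)
```

Both assignments mark the same number of genes as relevant, so the difference in score comes entirely from the network term, which prefers relevant genes that are connected to one another.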
Bayesian structure learning for causal gene networks:
We developed methods to learn intricate networks describing the joint effects of hundreds of genes on complex traits in yeast (with Jonathan Weissman, HHMI, UCSF). Recently developed interventional experimental methods have enabled large-scale measurement of quantitative genetic interactions (GI) in yeast, reporting functional dependencies among pairs of genes. With these measurements, we developed a Bayesian structure-learning method utilizing Annealed Importance Sampling, specifying a distribution over networks based on agreement with GI data. Applied to a recent protein-folding GI dataset in yeast, our results showed that detailed multi-gene networks can be reconstructed on a large scale from GI data, providing testable hypotheses in the genetics of complex traits. (Battle MSB 2010)