Rosenberg lab at Stanford University

We are a mathematical, theoretical, and computational lab in genetics and evolution. Research in the lab addresses problems in evolutionary biology and human genetics through a combination of mathematical modeling, computer simulations, development of statistical methods, and inference from population-genetic data.


  • 10-1-2019 — A new study by Rohan Mehta computes probabilities under the coalescent model of reciprocal monophyly for sets of gene lineages from three and four species. The computation extends an earlier computation that permitted only two sets of lineages [141]. The study appears in a special issue of Theoretical Population Biology celebrating Marc Feldman's 75th birthday.

  • 9-23-2019 — Nicolas Alcala studies the coalescent theory of all possible symmetric migration models involving at most four demes. His paper examines coalescent quantities such as the time to the most recent common ancestor under the models, determining how these quantities relate to network properties such as the mean number of edges per vertex and the density of edges. The study introduces a network perspective for coalescent models and applies it to empirical examples on tigers and birds of the genus Sholicola in India. PhD graduate Amy Goldberg also contributed to the project.

  • 9-9-2019 — A new paper led by Rohan Mehta examines the behavior of the FST measure of genetic differentiation on haplotypic data. The study illustrates how incrementing the length of the haplotype window tends to decrease FST — but sometimes increases it. The work is closely related to several of the lab's papers on FST [102] [121] [149] [165]. Check out the video abstract drawn and narrated by co-author Alison Feder.

  • 5-8-2019 — In a collaboration with the Stanford Conservation Program, we have developed a stochastic population occupancy model to examine two decades of occupancy data from the campus populations of the California red-legged frog (Rana draytonii). The model seeks to explain population declines of R. draytonii in campus creeks and suggests conservation management approaches for reversing these declines. The study was led by Nicolas Alcala.

  • 5-2-2019 — A new study led by Alissa Severson examines the relationship between runs of homozygosity and identity-by-descent tracts. The paper determines for a diploid coalescent model the time to the most recent common ancestor, both for two haplotypes in the same individual and for two haplotypes in different individuals. The work provides theory that builds on empirical observations in an earlier study [144].

  • 4-29-2019 — Nicolas Alcala has a new study of mathematical bounds on three population-genetic statistics: GST', Jost's D, and FST. He shows that for biallelic markers whose mean frequency across a set of populations is fixed, these three statistics achieve their maximal values at the same configuration of allele frequencies across populations. The results extend Nicolas's earlier work on FST bounds as well as that of two other studies from the lab concerning bounds on FST [102] [121].

  • 3-26-2019 — Filippo Disanto reports a study of the enumeration of compact coalescent histories for matching gene trees and species trees. Compact coalescent histories represent a combinatorial structure that collapses standard coalescent histories into a smaller number of equivalence classes. The study extends the lab's work on enumeration of coalescent histories to a new structure.

  • 3-3-2019 — A new paper discusses challenges of interpreting differences in polygenic scores across populations. The paper builds on the models developed by PhD graduate Doc Edge for analyzing the relationship between the magnitude of genetic and phenotypic differences among populations [129] [132].

  • 1-23-2019 — Two papers from the lab appear in a special issue of Bulletin of Mathematical Biology on Algebraic Methods in Phylogenetics.
    • Jaehee Kim, Filippo Disanto, and Naama Kopelman report a study of the properties of the neighbor-joining algorithm when applied to data from admixed populations. The study shows that tree properties conjectured by Kopelman et al. [99] do not necessarily hold for every distance matrix, but they do hold much more frequently than in a null model without an admixed taxon.

    • Filippo Disanto examines the number of nonequivalent ancestral configurations for matching gene trees and species trees. Nonequivalent ancestral configurations at first appear to be less numerous than ancestral configurations without applying the equivalence relation — studied previously by Filippo [152]. Here, Filippo shows that asymptotic growth for nonequivalent configurations is also exponential.
    This pair of studies extends the lab's work on theory of admixture and combinatorics of evolutionary trees.

  • 10-31-2018 — Congratulations to Ilana Arbisser on defending her thesis "Mathematical investigations into fundamental population genetics statistics and models." Ilana's thesis examines the joint distribution of the height and length of coalescent trees, the relationship of the population-genetic statistic FST to the triangle inequality, and the state space of a discrete-state coalescent model with recombination and migration. Dr. Arbisser's wise words about making hard decisions, such as square root transformation vs. Cailliez constant in multidimensional scaling: "it's important to consider what choices we're making and the consequences of those choices." Congrats Ilana!

  • 10-18-2018 — The lab examines the potential for determining that relatives genotyped with nonoverlapping marker sets are in fact relatives. This analysis demonstrates that people typed with microsatellites used in forensic genetics can be connected to close relatives typed with single-nucleotide polymorphisms used in biomedical, genealogical, personal-genomic, and population-genetic studies. Lead author is Jaehee Kim with former lab members Doc Edge and Bridget Algee-Hewitt also contributing. [CNN] [Nature] [New Scientist] [Science] [Scientific American] [Stanford Report] [Wired]

  • 9-14-2018 — A new study characterizes all (gene tree, species tree) pairs with exactly one coalescent history. The characterization of these "lonely" pairs relies on the way in which the taxa contained in cherries of the gene tree are placed with respect to the root of the species tree.

  • 8-30-2018 — Rohan Mehta has defended his PhD thesis "Mathematical modeling of genetic and cultural traits." Rohan's thesis studies a variety of mathematical modeling problems in diverse areas of population genetics and evolutionary biology. He examines combinatorial and probabilistic aspects of genealogical lineages along the branches of species trees, mathematical properties of FST statistics in relation to homozygosities and haplotypes, and a gene-culture coevolutionary model of health-related behaviors. Congrats Rohan!

  • 8-30-2018 — Three articles from the lab have recently appeared.
    • Alan Aw reports a study in the Journal of Mathematical Biology on the bounds on homozygosity and entropy statistics that measure genetic diversity in terms of the frequency of the most frequent allele. Alan uses the theory of majorization to obtain the bounds, generalizing previous mathematical results from the lab [52] [87].

    • Amy Goldberg and Lawrence Uricchio report an overview of the literature on natural selection in human populations in an Oxford Bibliographies article.

    • A commentary on Anthony Edwards's 2003 essay on multivariate classification of individuals into populations on the basis of genetic markers appears in a new book edited by Rasmus Winther about Edwards's career and contributions. Among other topics, the commentary discusses the influence of Edwards's model on a phenotypic model from the lab [129].

  • 7-18-2018 — Ilana Arbisser reports a mathematical investigation of the relationship between two of the most frequently used features of gene genealogies, the height and length of coalescent trees. The study also includes simulations describing the effect of population growth and population subdivision on the relationship between tree height and tree length. PhD graduate Ethan Jewett contributed to the project.

  • 7-11-2018 — Check out the Stanford X-Tree Project! The lab visualizes concepts in phylogenetics using photos of trees on the Stanford campus.

  • 5-30-2018 — We congratulate Jonathan Kang on the defense of his PhD thesis "Analysis and application of linkage disequilibrium in population and statistical genetics." In his thesis, Jonathan focuses on three questions concerning linkage disequilibrium (LD) and genomic sharing: the identification of optimal subsamples to prioritize for sequencing in order to enhance LD-based imputation, the relationship of runs of homozygosity to consanguinity in Jewish populations, and mathematical properties of measures of LD. Congrats to Jon!

  • 9-14-2017 — The lab reports a study of consanguinity and runs of homozygosity in Jewish populations. PhD student Jonathan Kang compares runs of homozygosity in contemporary Jewish populations to estimates of consanguinity measured in the 1950s from interviews with mothers in maternity wards. The study finds that the demographic consanguinity rates predict the fraction of the genome that resides in long runs of homozygosity. PhD graduates Amy Goldberg and Doc Edge contributed to the study.

  • 9-8-2017 — Filippo Disanto reports a study of the number of ancestral configurations possessed by matching gene trees and species trees. Ancestral configurations represent a combinatorial structure useful in producing probability formulas for gene trees given species trees, and they are hence connected to coalescent histories. Filippo's work is the latest in his series of combinatorial enumerations of structures that arise in the study of gene trees and species trees [123] [135] [142].

  • 8-21-2017 — A new study from the lab shows that in the admixed population of Cape Verde, genetic admixture is correlated with a measure of linguistic admixture evaluated by tabulating words of Portuguese and African origin in individuals' speech in the Cape Verde Kriolu language. The analyses suggest a mechanism of cotransmission of genetic and linguistic admixture during the descent of a creole-speaking admixed population. We congratulate lab alumni Paul Verdu*, Ethan Jewett*, and Trevor Pemberton on the study.

  • 8-5-2017 — Olga Kamneva reports a phylogenetic study of 20 worldwide species of strawberries (Fragaria) on the basis of next-generation sequencing data assembled via a bioinformatics pipeline designed specifically for polyploids of mixed ploidy. The study suggests new hypotheses for the diploid progenitors of polyploid species of Fragaria.

  • 7-6-2017 — Nicolas Alcala has obtained mathematical bounds on the population-genetic statistic FST in the case of a biallelic marker whose mean frequency across a set of populations is fixed. His bounds provide an explanation of a frequently observed dependence of FST on the number of populations under consideration. Nicolas's paper expands on two earlier studies from the lab concerning bounds on FST [102] [121].

  • 5-30-2017 — Recent PhD graduate Doc Edge reports in a new study that on the basis of correlations between genotypes at neighboring markers, profiles containing nonoverlapping sets of genetic markers can be connected to the same individual. This "record-matching" is demonstrated using genomic and forensic genetic markers, and it has implications for forensic genetics and genomic privacy. Postdoc Bridget Algee-Hewitt and former postdoc Trevor Pemberton contributed to the project. [Stanford Report news story]

  • 5-8-2017 — Congratulations to Amy Goldberg on the defense of her thesis "Mathematical and statistical approaches to elucidate recent human evolutionary history." Amy's thesis considers mechanistic mathematical models of admixture, including the effect of sex-biased admixture on autosomes and on the X chromosome, and the inference from ancient autosomes and X chromosomes of sex-biased migration during prehistoric admixture events in Europe. She also examines the human population size history of South America on the basis of the density and location of archaeological sites. Congratulations Amy!

  • 4-24-2017 — We congratulate Amy Goldberg on receiving the 2017 Sherwood Washburn Prize from the American Association of Physical Anthropologists! This prize recognizes the best student presentation at the AAPA annual meeting. Amy spoke about her work on the contrast between Neolithic and Bronze Age migrations in Europe in their levels of male and female migration. [Read the paper]

  • 3-15-2017 — Amy Goldberg reports that two ancient migration events in Europe involved different proportions of male and female migrants, the earlier Neolithic migration from Anatolia having similar numbers of males and females and the later Pontic-Caspian migration having a greater proportion of males. The result, relying on comparisons of ancient DNA patterns from the X chromosome and the autosomes, builds on Amy's earlier work on sex bias in genetic admixture models [122] [133]. [Science news story]

  • 3-10-2017 — A new simulation study by postdoc alum Olga Kamneva evaluates the behavior of several methods for inferring species networks when the evolutionary process includes hybridization. The paper provides much-needed information on the comparative performance of the various approaches.

  • 2-27-2017 — Postdoc alum Olga Kamneva reports in PLoS Computational Biology a study of the relationship between genome composition of microbes and the co-occurrence of microbes in the environment. She finds that comparisons of microbial genomes can contribute to predictions about whether microbes are associated ecologically. Congrats Olga!

  • 2-22-2017 — We wish several members of the lab well in their new positions.
    • Nicolas Alcala — Postdoc with Matthieu Foll, International Agency for Research on Cancer, World Health Organization, Lyon.
    • Filippo Disanto — Junior faculty, Department of Mathematics, University of Pisa (sponsored by the Rita Levi Montalcini researcher program).
    • Doc Edge — Postdoc with Graham Coop, Department of Evolution and Ecology, University of California, Davis.
    • Olga Kamneva — Bioinformatics Scientist, Affymetrix, Inc.

  • 11-14-2016 — Postdoc Lawrence Uricchio has reported an upper bound on the size of gene tree sets required before all splits of a species tree appear in a gene tree set with a specified probability. His upper bound depends on a single parameter — the shortest internal branch in the species tree. The computation extends the lab's work on methods for species tree inference from gene trees.

  • 10-14-2016 — Recent PhD graduate Doc Edge has devised a general mathematical model to understand how genotypic differences between populations contribute to phenotypic differences between populations. He uses the model to analyze the relationship of genetics to "health disparities," concluding that health disparities that all trend in the same direction are incompatible with neutral genetic explanations. The work extends a simpler model of Doc's [129], allowing for diploidy, genetic drift, and general distributions of allele frequencies.

  • 10-7-2016 — Postdoc Filippo Disanto continues the lab's work on coalescent histories with a study of the number of coalescent histories for matching gene trees in caterpillar-like families of species trees. Filippo's work solves an open problem from earlier work in the lab [111], showing that the number of coalescent histories is asymptotic to a constant multiple of the Catalan numbers. He uses clever iterative enumerations and techniques of analytic combinatorics to obtain the result. See also [41], [68], and [135] for related work.

  • 7-27-2016 — We are pleased to announce that the software MONOPHYLER is now available. MONOPHYLER computes probabilities that sets of lineages are monophyletic, both for general species trees and for trees of small size. MONOPHYLER is reported by PhD student Rohan Mehta. The software encodes formulas from Rohan's recent Proceedings of the National Academy of Sciences paper.

  • 7-22-2016 — We congratulate PhD student Doc Edge on his thesis defense, "Pick up the pieces: combining information from multiple genetic loci." Doc's thesis examines several problems in the mathematical modeling of the genotype-to-phenotype relationship in structured populations, mathematical properties of the FST measure of genetic differentiation, and population-genetic aspects of forensic DNA testing and genetic association studies. Doc has been recognized with the Samuel Karlin Prize in Mathematical Biology, awarded by the Department of Biology. Congratulations Doc!

  • 7-19-2016 — PhD student Rohan Mehta reports a computation of the probability that a set of gene lineages is monophyletic on an arbitrary species tree. The work generalizes earlier studies from the lab that considered trees of only two or three species. Rohan illustrates the new formula with an application in maize. The study is a contribution to the Comparative Phylogeography volume of the "In the Light of Evolution" special issue series of Proceedings of the National Academy of Sciences USA.

  • 6-27-2016 — We congratulate biology MS student Brian Donovan on the completion of his PhD in science education "An experimental exploration of how text-based instruction in school biology affects belief in genetic essentialism of race in adolescent populations." Brian defended his PhD in the Graduate School of Education on May 26. He is continuing his studies as a postdoctoral fellow at the Biological Sciences Curriculum Study in Colorado Springs.

  • 6-17-2016 — The lab reports a study examining the predicted distribution of gene tree shape under a birth-death model of species divergence. The work suggests that gene trees are expected to be more imbalanced than species trees, potentially providing part of the explanation for an excess of imbalance observed in inferred phylogenies.

  • 6-15-2016 — Congratulations to Amy Goldberg and Jaehee Kim, who have received fellowships for 2016-2017 from the Stanford Center for Computational, Evolutionary, and Human Genomics!

  • 5-12-2016 — Lab alumnus Mike DeGiorgio reports on the consistency properties of species tree inference methods in a model with ancestral population structure. By introducing a model that includes population subdivision in ancestral species, his paper introduces a new direction for studying consistency in species tree inference. The work is related to several recent papers from the lab on consistency of species tree methods ([85], [88], [89], [97], [109]).

  • 4-22-2016 — Several projects from the lab have been in the news:

  • 4-5-2016 — We congratulate PhD student Amy Goldberg on the publication of her Nature article entitled "Post-invasion demography of prehistoric humans in South America." In this work, Amy and her colleagues use the locations and dates of South American archaeological sites to estimate the time trajectory of the human population size history of the continent. Read the news story here.

  • 4-4-2016 — Lab members Bridget Algee-Hewitt, Doc Edge, and Jaehee Kim report that forensic genetic markers selected for their use in individual identification possess a surprising level of information about genetic ancestry. Moreover, their study finds that across genetic markers, information about individual identity is generally correlated with information about ancestry. The result makes use of theory from the lab on the connection between measures of genetic diversity and genetic differentiation ([102], [121]).

  • 1-5-2016 — The lab helps celebrate the centennial of the journal Genetics!

    When PhD student Amy Goldberg develops a model for sex-biased admixture on the X-chromosome, a curious mathematical sequence leads to an unexpected connection deep in the Genetics archive.

    Read about the oscillatory functions and coupled recursions encountered in this scholarly adventure — with a surprise appearance of the Fibonacci numbers.



    IM Arbisser, EM Jewett, NA Rosenberg (2018) On the joint distribution of tree height and tree length under the coalescent. Theoretical Population Biology 122: 46-56. [Abstract] [PDF]

    N Alcala, NA Rosenberg (2017) Mathematical constraints on FST: biallelic markers in arbitrarily many populations. Genetics 206: 1581-1600. [Abstract] [PDF] [File S1] [File S2]

    MD Edge, BFB Algee-Hewitt, TJ Pemberton, JZ Li, NA Rosenberg (2017) Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets. Proceedings of the National Academy of Sciences USA 114: 5671-5676. [Abstract] [PDF] [Supplement]

    JTL Kang, A Goldberg, MD Edge, DM Behar, NA Rosenberg (2016) Consanguinity rates predict long runs of homozygosity in Jewish populations. Human Heredity 82: 87-102. [Abstract] [PDF]

    RS Mehta, D Bryant, NA Rosenberg (2016) The probability of monophyly of a sample of gene lineages on a species tree. Proceedings of the National Academy of Sciences USA 113: 8002-8009. [Abstract] [PDF] [Supplement] [Software]

    F Disanto, NA Rosenberg (2015) Coalescent histories for lodgepole species trees. Journal of Computational Biology 22: 918-929. [Abstract] [PDF]

    A Goldberg, NA Rosenberg (2015) Beyond 2/3 and 1/3: the complex signatures of sex-biased admixture on the X chromosome. Genetics 201: 263-279. [Abstract] [PDF]

    NA Rosenberg, JTL Kang (2015) Genetic diversity and societally important disparities. Genetics 201: 1-12. [Abstract] [PDF] [Supplement]