Structure Software

Pritchard Lab, Stanford University


The program structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

In 2016, John Novembre wrote a short historical perspective on Structure.


Download Structure 2.3.4.

fastSTRUCTURE for large SNP datasets is out now! Links to the preprint and software (beta release) by Anil Raj, Matthew Stephens and Jonathan Pritchard.

What to cite: The basic algorithm was described by Pritchard, Stephens & Donnelly (2000). Extensions to the method were published by Falush, Stephens & Pritchard (2003, 2007) and by Hubisz, Falush, Stephens & Pritchard (2009).

Contributors: Daniel Falush, Melissa Hubisz, Matthew Stephens, Jonathan Pritchard, Peter Donnelly, William Wen, Mike Trienis, Pall Melsted.

Questions and Discussion: Questions can be directed to the Structure discussion forum. Many thanks to Vikram Chhatre, who moderates this discussion group. Bug Reports.

Plotting programs and other resources: The Structure software performs basic plotting and reporting of results. CLUMPAK by Naama Kopelman and Itay Mayrose builds on Noah Rosenberg's earlier programs CLUMPP and distruct to produce graphical displays of Structure results and to compute useful statistics. Structure Harvester by Dent Earl provides additional tools for visualizing Structure output. Xavier Didelot's program xmfa2struct converts files in eXtended Multi-Fasta (XMFA) format into Structure input format. Maike Morrison's FSTruct package provides a tool to quantify the variability of ancestry (Q) matrices.

Genome-wide SNP data: TreeMix by Joe Pickrell and Jonathan Pritchard uses large numbers of SNPs to estimate the historical relationships among populations, using a graph representation that allows both population splits and migration events. [Note: Joe's latest release now allows microsatellite data too.] fastSTRUCTURE by Anil Raj, Matthew Stephens and Jonathan Pritchard is designed for running Structure on very large SNP datasets [Raj et al. 2014]. fineSTRUCTURE by Daniel Lawson and colleagues enables analyses of very fine-scale structure in genome-wide SNP data.

Sample data sets: available here.

Taita thrush: An example of MCMC convergence based on the original paper is shown here.

Some miscellaneous applications: structure has been widely used for interpreting population structure of humans and other organisms. A selection of interesting references (mainly applications) is shown below.

Traces of human migrations in Helicobacter pylori populations. D. Falush, T. Wirth, B. Linz, J.K. Pritchard, M. Stephens and 13 others, 2003. Science, 299: 1582-1585. [PDF]

The genetic structure of human populations. N.A. Rosenberg, J.K. Pritchard, J.L. Weber, H.M. Cann, K.K. Kidd, L.A. Zhivotovsky and M.W. Feldman, 2002. Science, 298: 2381-2385. (and technical comment, 2003) [PDF]

Dwarf8 polymorphisms associate with variation in flowering time. Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES. Nat Genet. 2001 28:286-9. [PubMed Abstract]

Origin of extant domesticated sunflowers in eastern North America. Harter AV, Gardner KA, Falush D, Lentz DL, Bye RA, Rieseberg LH. Nature. 2004 430:201-5. [PubMed Abstract]

Emerging vectors in the Culex pipiens complex. Fonseca DM, Keyghobadi N, Malcolm CA, Mehmet C, Schaffner F, Mogi M, Fleischer RC, Wilkerson RC. Science. 2004 303:1535-8. [PubMed Abstract]

Empirical evaluation of genetic clustering methods using multilocus genotypes from 20 chicken breeds. Rosenberg NA et al. Genetics. 2001 159:699-713. [PubMed Abstract]