STRUCTURE
Version 2.2 (April 3, 2007)
The program STRUCTURE implements a model-based clustering method for inferring population structure from genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed.
Program information
- written in C with a Java front end
- UNIX
- Windows
- Mac OS X
Data types handled
- SNP
- Microsatellites
- RFLP
- AFLP
Input Files
The entire data set is arranged as a matrix in a single file, with individuals in rows and loci in columns. For a diploid organism, the data for each individual can be stored either as two consecutive rows, with each locus in one column, or as one row, with each locus in two consecutive columns.
Example of genotype data in the two-row format (-9 marks missing data):
            loc_a  loc_b  loc_c  loc_d  loc_e
George   1     -9    145     66      0     92
George   1     -9     -9     64      0     94
Paula    1    106    142     68      1     92
Paula    1    106    148     64      0     94
Matthew  2    110    145     -9      0     92
Matthew  2    110    148     66      1     -9
Bob      2    108    142     64      1     94
Bob      2     -9    142     -9      0     94
Anja     1    112    142     -9      1     -9
Anja     1    114    142     66      1     94
Peter    1     -9    145     66      0     -9
Peter    1    110    145     -9      1     -9
Carsten  2    108    145     62      0     -9
Carsten  2    110    145     64      1     92
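The two layouts described above carry the same information, so one can be converted into the other. The following Python sketch is not part of STRUCTURE; it only illustrates how the two-row layout might be read and rewritten as the one-row layout. The helper names (`parse_two_row`, `to_one_row`, `count_missing_loci`) and the `n_extra_cols` parameter are hypothetical. It assumes whitespace-separated fields, a header row of locus names, one extra column between the label and the genotypes (the 1/2 column in the example), and -9 as the missing-data code.

```python
MISSING = "-9"  # assumption: missing-data code, as used in the example data

SAMPLE = """
loc_a loc_b loc_c loc_d loc_e
George 1 -9 145 66 0 92
George 1 -9 -9 64 0 94
Paula 1 106 142 68 1 92
Paula 1 106 148 64 0 94
"""

def parse_two_row(text, n_extra_cols=1):
    """Read the two-rows-per-individual layout.

    Returns (loci, individuals), where individuals maps each label to
    (extra_columns, [(allele_1, allele_2), ...]) with one pair per locus.
    """
    rows = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    loci, data = rows[0], rows[1:]
    if len(data) % 2 != 0:
        raise ValueError("expected two rows per diploid individual")
    start = 1 + n_extra_cols  # index of the first genotype column
    individuals = {}
    for first, second in zip(data[0::2], data[1::2]):
        if first[0] != second[0]:
            raise ValueError(f"row labels differ: {first[0]} / {second[0]}")
        a1, a2 = first[start:], second[start:]
        if len(a1) != len(loci) or len(a2) != len(loci):
            raise ValueError(f"wrong number of loci for {first[0]}")
        individuals[first[0]] = (first[1:start], list(zip(a1, a2)))
    return loci, individuals

def to_one_row(individuals):
    """Rewrite each individual as one row with two consecutive columns per locus."""
    lines = []
    for label, (extras, genotypes) in individuals.items():
        flat = [allele for pair in genotypes for allele in pair]
        lines.append(" ".join([label, *extras, *flat]))
    return lines

def count_missing_loci(genotypes):
    """Count loci at which at least one allele is missing."""
    return sum(1 for pair in genotypes if MISSING in pair)

loci, individuals = parse_two_row(SAMPLE)
print(to_one_row(individuals)[0])
# prints: George 1 -9 -9 145 -9 66 64 0 0 92 94
```

Alleles are kept as strings so that the output reproduces the input values exactly; convert them to integers only if numeric comparisons are needed.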
How to cite
The basic algorithm:
Pritchard, J. K., Stephens, M., and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155:945–959.
Extensions to the method:
Falush, D., Stephens, M., and Pritchard, J. K. (2003). Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics, 164:1567–1587.
and
Falush, D., Stephens, M., and Pritchard, J. K. (2007). Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes.