structure

Differences

This shows you the differences between two versions of the page.

structure [2007/12/07 10:08] heidi  structure [2011/07/07 12:54] (current) heidi
Line 1: Line 1:
 ====== STRUCTURE ====== ====== STRUCTURE ======
-{{structure.jpg?600}}+{{structure.jpg?650}}
  
 \\ \\
-**[[http://pritch.bsd.uchicago.edu/structure.html|STRUCTURE]]**+**[[http://pritch.bsd.uchicago.edu/structure.html|STRUCTURE]]**\\
 [[http://pritch.bsd.uchicago.edu/software/structure22/readme.pdf|documentation]] [[http://pritch.bsd.uchicago.edu/software/structure22/readme.pdf|documentation]]
  
 \\ \\
-Version 2.2 (April 3, 2007)\\ +Version 2.3.3 (January 2010)\\ 
-The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. +The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. 
  
  
Line 16: Line 16:
   * Windows   * Windows
   * Mac OS X    * Mac OS X 
 +
  
  
 ===== Data type handled ===== ===== Data type handled =====
-  * SNP+  * SNP (numeric)
   * Microsatellites   * Microsatellites
   * RFLP   * RFLP
   * AFLP   * AFLP
   * diploid/haploid   * diploid/haploid
 +
  
  
Line 35: Line 37:
 The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns.
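The two diploid layouts described above carry the same information, so one can be converted into the other. The following is a minimal Python sketch of that conversion, not part of STRUCTURE itself; the function name `two_rows_to_one_row` and the toy data are hypothetical, and it assumes each row starts with a single label column followed by one allele per locus.

```python
def two_rows_to_one_row(rows):
    """Convert the 2-rows-per-individual layout to the 1-row layout.

    `rows` is a list of lists: [label, allele_locus1, allele_locus2, ...].
    Consecutive pairs of rows are assumed to belong to the same individual.
    """
    if len(rows) % 2 != 0:
        raise ValueError("expected an even number of rows (two per individual)")
    merged = []
    for first, second in zip(rows[0::2], rows[1::2]):
        if first[0] != second[0]:
            raise ValueError(f"label mismatch: {first[0]!r} vs {second[0]!r}")
        out = [first[0]]
        # Interleave alleles so each locus occupies two consecutive columns.
        for a, b in zip(first[1:], second[1:]):
            out.extend([a, b])
        merged.append(out)
    return merged


# Hypothetical toy data: two individuals, three loci, -9 assumed to mean "missing".
rows = [
    ["ind1", 110, 145, 64],
    ["ind1", 112, 145, 66],
    ["ind2", 108, 147, 64],
    ["ind2", 110, 149, -9],
]
one_row = two_rows_to_one_row(rows)
```

Whichever layout is used has to be declared consistently when the data set parameters are set, so converting everything to one layout before loading is usually the simpler choice.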
  
-\\ 
 === rows: === === rows: ===
   * **Marker Names** (Optional; string):  The first row can contain a list of identifiers for each of the markers in the data set. This row contains L strings of integers or characters, where L is the number of loci.   * **Marker Names** (Optional; string):  The first row can contain a list of identifiers for each of the markers in the data set. This row contains L strings of integers or characters, where L is the number of loci.
Line 51: Line 52:
 one or more rows. one or more rows.
  
- \\ 
 === Individual/genotype data: === === Individual/genotype data: ===
 Each row of individual data contains the following elements. These form columns in the data file: Each row of individual data contains the following elements. These form columns in the data file:
Line 61: Line 61:
   * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g. microsatellite repeat score).   * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g. microsatellite repeat score).
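For SNP data stored as nucleotide letters, the integer coding has to be applied before the file is written. The sketch below shows one possible way to do that in Python; the mapping `NUCLEOTIDE_CODES`, the function name `encode_alleles`, and the choice of -9 for anything unrecognised are illustrative assumptions, and any consistent set of distinct integers would serve equally well.

```python
# Hypothetical mapping: any consistent set of distinct integers would do.
NUCLEOTIDE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}
MISSING = -9  # assumed missing-data value, must not collide with an allele code


def encode_alleles(alleles, codes=NUCLEOTIDE_CODES, missing=MISSING):
    """Map nucleotide letters to integers; unknown symbols become `missing`."""
    return [codes.get(a.upper(), missing) for a in alleles]


# Lower-case input is accepted; "N" is not in the mapping and is coded as missing.
encoded = encode_alleles(["A", "g", "N", "T"])
```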
  
-\\ 
 === Missing genotype data: === === Missing genotype data: ===
 Missing data should be indicated by a number that doesn't occur elsewhere in the data (often -9 by convention). The missing-data value is set along with the other parameters describing the characteristics of the data set. Missing data should be indicated by a number that doesn't occur elsewhere in the data (often -9 by convention). The missing-data value is set along with the other parameters describing the characteristics of the data set.
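Before running an analysis it can be useful to see how much data is missing at each locus. The following Python sketch counts occurrences of the missing-data value per column; the function name `missing_counts` and the toy matrix are hypothetical, and the default of -9 simply follows the convention mentioned above.

```python
def missing_counts(genotypes, missing=-9):
    """Count, for each locus (column), how many entries equal `missing`.

    `genotypes` is a list of rows containing allele codes only
    (labels and population columns already stripped).
    """
    if not genotypes:
        return []
    n_loci = len(genotypes[0])
    counts = [0] * n_loci
    for row in genotypes:
        if len(row) != n_loci:
            raise ValueError("all rows must have the same number of loci")
        for j, allele in enumerate(row):
            if allele == missing:
                counts[j] += 1
    return counts


# Hypothetical toy matrix: three rows, three loci.
genotypes = [
    [110, 145, -9],
    [112, -9, -9],
    [108, 147, 64],
]
counts = missing_counts(genotypes)
```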
  
 \\ \\
 +
  
 ==== example: ==== ==== example: ====
-example for the genotype data:+example for genotype data:
 <code> <code>
             loc_a  loc_b  loc_c  loc_d  loc_e             loc_a  loc_b  loc_c  loc_d  loc_e
Line 86: Line 86:
 Carsten  2   110    145     64         92 Carsten  2   110    145     64         92
 </code> </code>
 +
 +
  
  
 ===== How to cite ===== ===== How to cite =====
 The basic algorithm:\\ The basic algorithm:\\
-Pritchard, J. K., Stephens, M., and Donnelly, P. (2000a). Inference of population structure using multilocus genotype data. Genetics, 155:945-959. +  * Pritchard, J. K., Stephens, M., and Donnelly, P. (2000a). Inference of population structure using multilocus genotype data. Genetics, 155:945-959.
  
 \\ \\
 Extensions to the method:\\ Extensions to the method:\\
-Falush, D., Stephens, M., and Pritchard, J. K. (2003a). Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics, 164:1567-1587.\\ +  * Falush, D., Stephens, M., and Pritchard, J. K. (2003a). Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics, 164:1567-1587.
-and\\ +  * Falush, D., Stephens, M., and Pritchard, J. K. (2007). Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes.
-Falush, D., Stephens, M., and Pritchard, J. K. (2007). Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes. +  * Hubisz M. J., et al. (2009). Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources, 9:1322-32.
structure.1197018537.txt.gz · Last modified: 2008/07/22 13:30 (external edit)