structure
Differences
This shows you the differences between two versions of the page.
structure [2007/12/07 10:07] – heidi | structure [2011/07/07 12:50] – heidi | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== STRUCTURE ====== | ====== STRUCTURE ====== | ||
- | {{structure.jpg? | + | {{structure.jpg? |
\\ | \\ | ||
- | **[[http:// | + | **[[http:// |
[[http:// | [[http:// | ||
\\ | \\ | ||
- | Version 2.2 (April | + | Version 2.3.3 (January 2010)\\ |
- | The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. | + | The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. |
Line 16: | Line 16: | ||
* Windows | * Windows | ||
* Mac OS X | * Mac OS X | ||
+ | |||
===== Data type handled ===== | ===== Data type handled ===== | ||
- | * SNP | + | * SNP (numeric) |
* Microsatellites | * Microsatellites | ||
* RFLP | * RFLP | ||
* AFLP | * AFLP | ||
* diploid/ | * diploid/ | ||
+ | |||
+ | |||
Line 34: | Line 37: | ||
The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | ||
- | \\ | ||
=== rows: === | === rows: === | ||
* **Marker Names** (Optional; string): | * **Marker Names** (Optional; string): | ||
Line 40: | Line 42: | ||
* **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | * **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | ||
* **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | * **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | ||
- | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, | + | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, |
- the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) | - the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) | ||
The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | ||
Line 50: | Line 52: | ||
one or more rows. | one or more rows. | ||
- | \\ | ||
=== Individual/ | === Individual/ | ||
Each row of individual data contains the following elements. These form columns in the data file: | Each row of individual data contains the following elements. These form columns in the data file: | ||
Line 60: | Line 61: | ||
* **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | ||
- | \\ | ||
=== Missing genotype data: === | === Missing genotype data: === | ||
Missing data should be indicated by a number that doesn' | Missing data should be indicated by a number that doesn' | ||
\\ | \\ | ||
+ | |||
==== example: ==== | ==== example: ==== | ||
- | example for the genotype data: | + | example for genotype data: |
< | < | ||
loc_a loc_b loc_c loc_d loc_e | loc_a loc_b loc_c loc_d loc_e |
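As a rough illustration of the matrix layout described under the data format (individuals in rows, loci in columns, a diploid individual stored as two consecutive rows), the following Python sketch reads such a file and re-emits it in the alternative one-row form, where each locus occupies two consecutive columns. It is not part of structure. The function names, the sample data, the leading label column, and the missing-data code -9 are all assumptions made for this example.

```python
MISSING = -9  # assumed missing-data code; use the value your own file declares

# Hypothetical sample: a marker-name row, then two rows per diploid individual.
SAMPLE = """\
loc_a loc_b loc_c
ind1 120 8 -9
ind1 124 8 -9
ind2 118 10 3
ind2 118 12 3
"""


def parse_two_row_diploid(text, missing=MISSING):
    """Return (markers, individuals) from a two-rows-per-individual matrix.

    Assumed layout: the first line holds the marker names and every other
    line is "<label> <allele> ... <allele>", whitespace-delimited.
    Missing alleles are returned as None.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    markers, data = rows[0], rows[1:]
    if len(data) % 2:
        raise ValueError("expected two rows per diploid individual")

    individuals = {}
    for first, second in zip(data[0::2], data[1::2]):
        label = first[0]
        if second[0] != label:
            raise ValueError(f"rows for {label!r} are not consecutive")
        if len(first) != len(markers) + 1 or len(second) != len(markers) + 1:
            raise ValueError(f"{label!r}: expected {len(markers)} loci per row")
        pairs = []
        for a, b in zip(first[1:], second[1:]):
            a, b = int(a), int(b)
            pairs.append((None if a == missing else a,
                          None if b == missing else b))
        individuals[label] = pairs
    return markers, individuals


def to_one_row(individuals, missing=MISSING):
    """Format each individual as one row, two consecutive columns per locus."""
    lines = []
    for label, pairs in individuals.items():
        cells = [label]
        for a, b in pairs:
            cells.append(str(missing if a is None else a))
            cells.append(str(missing if b is None else b))
        lines.append(" ".join(cells))
    return lines


if __name__ == "__main__":
    markers, individuals = parse_two_row_diploid(SAMPLE)
    for line in to_one_row(individuals):
        print(line)
```

Pairing rows by position and then checking that the labels match keeps the "two consecutive rows" requirement explicit, so a file with a dropped or misplaced row raises an error rather than silently mixing individuals.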
structure.txt · Last modified: 2011/07/07 12:54 by heidi