structure
Differences
This shows you the differences between two versions of the page.
structure [2007/12/07 10:03] – heidi | structure [2008/07/22 13:31] – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== STRUCTURE ====== | ====== STRUCTURE ====== | ||
- | {{structure.jpg? | + | {{structure.jpg? |
\\ | \\ | ||
- | **[[http:// | + | **[[http:// |
[[http:// | [[http:// | ||
Line 16: | Line 16: | ||
* Windows | * Windows | ||
* Mac OS X | * Mac OS X | ||
+ | |||
===== Data type handled ===== | ===== Data type handled ===== | ||
- | * SNP | + | * SNP (numeric) |
* Microsatellites | * Microsatellites | ||
* RFLP | * RFLP | ||
* AFLP | * AFLP | ||
*diploid/ | *diploid/ | ||
+ | |||
+ | |||
+ | |||
+ | |||
Line 32: | Line 37: | ||
The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | ||
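As a rough illustration of the two diploid layouts described above, the following Python sketch converts between them. It assumes the rows hold genotype columns only (no label, population, or extra columns), and the function names `two_rows_to_one_row` and `one_row_to_two_rows` are hypothetical helpers, not part of STRUCTURE.

```python
def two_rows_to_one_row(rows):
    """Merge each pair of consecutive rows into one row.

    Input:  two rows per individual, one column per locus.
    Output: one row per individual, two consecutive columns per locus.
    """
    if len(rows) % 2 != 0:
        raise ValueError("expected two rows per diploid individual")
    merged = []
    for first, second in zip(rows[0::2], rows[1::2]):
        if len(first) != len(second):
            raise ValueError("both rows of an individual must cover the same loci")
        # Interleave the alleles: locus 1 copy 1, locus 1 copy 2, locus 2 copy 1, ...
        merged.append([allele for pair in zip(first, second) for allele in pair])
    return merged


def one_row_to_two_rows(rows):
    """Split each one-row individual back into two consecutive rows."""
    split = []
    for row in rows:
        if len(row) % 2 != 0:
            raise ValueError("expected two columns per locus")
        split.append(row[0::2])  # first allele copy at every locus
        split.append(row[1::2])  # second allele copy at every locus
    return split
```

Either layout carries the same information, so the choice mostly depends on which is easier to produce from the source data.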
- | \\ | ||
=== rows: === | === rows: === | ||
* **Marker Names** (Optional; string): | * **Marker Names** (Optional; string): | ||
Line 38: | Line 42: | ||
* **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | * **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | ||
* **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | * **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | ||
- | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, | + | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, |
- | - the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) | + | - the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) |
The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | ||
102 156 165 101 143 105 104 101 | 102 156 165 101 143 105 104 101 | ||
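A phase line can be sanity-checked before running the program. The sketch below only tests the two constraints stated above (L entries, each a real number in the range [0,1]); `parse_phase_line` is a hypothetical helper, and whitespace-separated values are an assumption about the file layout.

```python
def parse_phase_line(line, num_loci):
    """Parse one phase-information line and check its basic constraints."""
    values = [float(token) for token in line.split()]
    if len(values) != num_loci:
        raise ValueError(
            "phase line has %d entries, expected %d" % (len(values), num_loci)
        )
    for value in values:
        # Each entry is a probability, so it must lie in [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError("phase value %r is outside [0, 1]" % value)
    return values
```

For the 8-locus example here, a call would look like `parse_phase_line(line, 8)`.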
Line 56: | Line 60: | ||
* **Extra Columns** (Optional; string): It may be convenient for the user to include additional data in the input file which are ignored by the program. These go here, and may be strings of integers or characters. | * **Extra Columns** (Optional; string): It may be convenient for the user to include additional data in the input file which are ignored by the program. These go here, and may be strings of integers or characters. | ||
  * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | ||
- | |||
=== Missing genotype data: === | === Missing genotype data: === | ||
Missing data should be indicated by a number that doesn' | Missing data should be indicated by a number that doesn' | ||
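One way to apply a numeric missing-data code is sketched below. It assumes the chosen code must not coincide with any real allele value, and it uses -9 purely as an illustrative default; `encode_missing` is a hypothetical helper, and missing calls are assumed to be represented as `None` before encoding.

```python
def encode_missing(genotypes, missing_code=-9):
    """Replace None entries with a numeric missing-data code.

    Raises ValueError if the code already occurs as a real allele,
    because it would then be ambiguous in the output file.
    """
    observed = {allele for row in genotypes for allele in row if allele is not None}
    if missing_code in observed:
        raise ValueError("missing-data code %d collides with an allele" % missing_code)
    return [
        [missing_code if allele is None else allele for allele in row]
        for row in genotypes
    ]
```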
+ | |||
+ | \\ | ||
+ | |||
==== example: ==== | ==== example: ==== | ||
- | example for the genotype data: | + | example for genotype data: |
< | < | ||
loc_a loc_b loc_c loc_d loc_e | loc_a loc_b loc_c loc_d loc_e |
structure.txt · Last modified: 2011/07/07 12:54 by heidi