structure
Differences
This shows you the differences between two versions of the page.
structure [2007/12/07 10:05] – heidi | structure [2007/12/11 09:48] – heidi | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== STRUCTURE ====== | ====== STRUCTURE ====== | ||
- | {{structure.jpg? | + | {{structure.jpg? |
\\ | \\ | ||
- | **[[http:// | + | **[[http:// |
[[http:// | [[http:// | ||
Line 24: | Line 24: | ||
* AFLP | * AFLP | ||
*diploid/ | *diploid/ | ||
+ | |||
+ | |||
+ | |||
Line 33: | Line 36: | ||
The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | The entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. | ||
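The two diploid layouts described above carry the same information, so one can be converted into the other. The following is a minimal sketch, assuming the genotype block has already been read into lists of integers with no label columns; the function name `two_rows_to_one_row` and the sample genotypes are hypothetical and are not part of STRUCTURE itself.

```python
from typing import List


def two_rows_to_one_row(rows: List[List[int]]) -> List[List[int]]:
    """Convert the two-rows-per-individual layout into the one-row layout.

    Input: 2 consecutive rows per individual, one column per locus.
    Output: 1 row per individual, two consecutive columns per locus.
    """
    if len(rows) % 2 != 0:
        raise ValueError("expected 2 rows per diploid individual")
    merged = []
    for i in range(0, len(rows), 2):
        first, second = rows[i], rows[i + 1]
        if len(first) != len(second):
            raise ValueError("both rows of an individual must cover the same loci")
        one_row = []
        for allele_a, allele_b in zip(first, second):
            # The two allele copies of a locus become adjacent columns.
            one_row.extend([allele_a, allele_b])
        merged.append(one_row)
    return merged


# Hypothetical genotypes: 2 individuals, 3 loci, two rows per individual.
two_row_layout = [
    [102, 156, 165],
    [104, 156, 167],
    [100, 150, 165],
    [100, 152, 165],
]
one_row_layout = two_rows_to_one_row(two_row_layout)
```

Each output row holds 2 x L values for L loci, which is the layout in which each locus occupies two consecutive columns.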
- | \\ | ||
=== rows: === | === rows: === | ||
* **Marker Names** (Optional; string): | * **Marker Names** (Optional; string): | ||
Line 39: | Line 41: | ||
* **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | * **Inter-Marker Distances** (Optional; real numbers): the next row is a set of inter-marker distances, for use with linked loci (contains L real numbers). These should be genetic distances (e.g., centiMorgans), | ||
* **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | * **Phase Information** (Optional; diploid data only; real number in the range [0,1]): This is for use with the linkage model only. A single row of L probabilities that appears after the genotype data for each individual. There are two alternative representations for the phase information: | ||
- | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, | + | - the two rows of data for an individual are assumed to correspond to the paternal and maternal contributions, |
- | - the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) | + | - the phase line indicates the probability that the phase of one allele relative to the previous allele is correct (set MARKOVPHASE=1) |
The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | The first entry should be filled in with 0.5 to fill out the line to L entries. For example, the following data input would represent the information from a male with 5 unphased autosomal microsatellite loci followed by three X chromosome loci, using the maternal/ | ||
102 156 165 101 143 105 104 101 | 102 156 165 101 143 105 104 101 | ||
Line 49: | Line 51: | ||
one or more rows. | one or more rows. | ||
- | \\ | ||
=== Individual/ | === Individual/ | ||
Each row of individual data contains the following elements. These form columns in the data file: | Each row of individual data contains the following elements. These form columns in the data file: | ||
Line 59: | Line 60: | ||
* **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | * **Genotype Data** (Required; integer): Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). | ||
- | \\ | ||
=== Missing genotype data: === | === Missing genotype data: === | ||
Missing data should be indicated by a number that doesn' | Missing data should be indicated by a number that doesn' | ||
\\ | \\ | ||
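Before running an analysis it can help to see how much of each locus is missing. The sketch below assumes the missing-data code is -9, which is a common choice but only an assumption here; any integer that never occurs as a real allele would work. It also assumes the genotypes are already parsed into rows of integers. The name `missing_fraction_per_locus` and the sample data are hypothetical.

```python
from typing import List

MISSING = -9  # assumed missing-data code; must not occur as a real allele


def missing_fraction_per_locus(rows: List[List[int]],
                               missing: int = MISSING) -> List[float]:
    """Return, for each locus (column), the fraction of rows coded as missing."""
    if not rows:
        return []
    n_loci = len(rows[0])
    counts = [0] * n_loci
    for row in rows:
        if len(row) != n_loci:
            raise ValueError("all rows must have the same number of loci")
        for j, allele in enumerate(row):
            if allele == missing:
                counts[j] += 1
    return [count / len(rows) for count in counts]


# Hypothetical data: 4 rows, 3 loci, -9 marks a missing allele.
rows = [
    [102, -9, 165],
    [104, 156, -9],
    [100, -9, 165],
    [100, 152, 165],
]
fractions = missing_fraction_per_locus(rows)
```

Loci with a high missing fraction may be worth checking in the raw data before they are included.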
+ | |||
+ | |||
==== example: ==== | ==== example: ==== | ||
- | example for the genotype data: | + | example for genotype data: |
< | < | ||
loc_a loc_b loc_c loc_d loc_e | loc_a loc_b loc_c loc_d loc_e |
structure.txt · Last modified: 2011/07/07 12:54 by heidi