pgd

Differences

This shows you the differences between two versions of the page.

pgd [2011/10/11 15:46] heidi
pgd [2016/02/22 15:41] (current) heidi
Line 1: Line 1:
 ====== PGD - Population Genetics Data format ======
-Version 1.0\\
+Version 1.1\\
 +\\ 
 +PGD (Population Genetics Data) is a file format designed to contain population genetics data. The aim of this format is to facilitate the transfer among several population genetics software packages. PGD plays an important role in the new data format converter PGDSpider.\\ 
 +\\ 
 +PGD is written in XML and is therefore independent of any particular computer system and extensible for future needs. The XML structure can easily be processed by computer programs. An additional XSLT style sheet makes it possible to display the data in an understandable and comprehensive way. This XSLT style sheet is delivered within the PGDSpider download.\\ 
 +\\ 
 +The PGDSpider distribution also includes an XML Schema (PGD_schema.xsd), which defines the structure of the PGD file. The purpose of an XML Schema is to define the legal building blocks of an XML document and the allowable contents (W3Schools, 2008). The provided XML Schema can be used to validate a PGD file.
 \\ \\
-PGD (Population Genetics Data) is a file format designed to contain population genetics data. The aim of this format is to facilitate the transfer among several population genetics software packages. PGD plays an important role in the new data format converter PGDSpider. 
-PGD is written in XML and is therefore independent of any particular computer system and extensible for future needs. The XML structure can easily be processed by computer programs. An additional XSLT style sheet makes it possible to display the data in an understandable and comprehensive way. This XSLT style sheet is delivered within the PGDSpider download. 
- 
 \\ \\
  
Line 108: Line 111:
     * Value: Integer
     * Gives the length of the locus in number of bases
 +  * <locusAncestralState> (optional):
 +    * Value: String
 +    * Gives the ancestral state of the locus
   * <locusLinks> (optional):
     * Value: String
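
The optional locus tags in this hunk can be read with a standard XML parser. The following is a minimal sketch in Python, not part of the PGD specification: the enclosing ''<locus>'' element, its ''name'' attribute and the sample values are assumptions made for illustration; only the child tag names (''locusLength'', ''locusAncestralState'', ''locusLinks'') come from the list above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <locus> wrapper, its name attribute and the
# values are assumptions; only the child tag names come from the PGD text.
FRAGMENT = """
<locus name="locus_one">
  <locusLength>29</locusLength>
  <locusAncestralState>G</locusAncestralState>
</locus>
"""


def read_locus(xml_text):
    """Return the optional locus fields, using None for tags that are absent."""
    locus = ET.fromstring(xml_text)
    length = locus.findtext("locusLength")
    state = locus.findtext("locusAncestralState")
    links = locus.findtext("locusLinks")
    return {
        # locusLength is documented as an Integer, the other two as Strings
        "length": int(length) if length is not None else None,
        "ancestral_state": state.strip() if state is not None else None,
        "links": links.strip() if links is not None else None,
    }


info = read_locus(FRAGMENT)
```

Because every tag shown here is optional, the sketch maps a missing tag to ''None'' instead of raising an error.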
Line 139: Line 145:
     * Defines the names of the loci in the data for this population, separated by comma
     * The loci have to be of the same type
-    * If the data are nucleotide sequence data, just include one locus in this tag (if more than one locus exist one has to repeat the whole <ind> tag for every locus and specify the locus name in the <indLoci> tag). For example: <code> 
-<ind name="1"> 
- <indLoci> loci one </indLoci> 
- <data> GACTCTCTACGTAGCATCCGATGACGATA </data> 
- <data> GACTCTCTACGTAGCATCCGATGACGATA </data> 
-</ind> 
-     
-<ind name="1"> 
- <indLoci> loci two </indLoci> 
- <data> GACTCTCTACGTAGCATCCGATGACGATA </data> 
- <data> GACTCTCTACGTAGCATCCGATGACGATA </data> 
-</ind> 
-</code> 
- 
   * <ind> with attribute “name=” (mandatory):
     * Defines the different individuals in this population
Line 169: Line 161:
     * Only if the data are of different data types in this population
     * Defines the loci names of the data with the same data type in this individual separated by “,”
-    * The loci must be of the same data type and all loci of the same data type have to be included (only one loci tag allowed per data type, except for nucleotide sequence data)
-    * If the data are nucleotide sequence data, just include one loci in this tag (if more than one loci exist you have to repeat the whole ind tag for every loci)
+    * The loci must be of the same data type
   * <indPloidy> (optional):
     * Value: Integer
Line 242: Line 233:
     * only one per file
 \\  \\
-  * Microsat data are number of repeats
-  * Only one loci tag is allowed per data type, i.e. all loci of the same data type have to be in the same tag
-  * Nucleotide sequence data: just one locus in one tag (popLoci/ indLoci tag). If more than one loci exist repeat the whole ind tag for every loci
+  * Microsat data are number of repeats
   * distanceMatrix: lower triangle with diagonal
 \\ \\
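
The "lower triangle with diagonal" layout of distanceMatrix can be expanded into a full symmetric matrix. The sketch below is in Python and rests on assumptions the page does not state: it assumes one matrix row per line with values separated by whitespace, and the function name ''expand_lower_triangle'' is hypothetical. The actual text layout inside a PGD file may differ.

```python
def expand_lower_triangle(text):
    """Expand a lower-triangle-with-diagonal matrix into a full symmetric one.

    Assumed layout (not confirmed by the PGD page): row i (0-based) is on its
    own line and holds i + 1 whitespace-separated numbers.
    """
    rows = [
        [float(value) for value in line.split()]
        for line in text.strip().splitlines()
        if line.strip()
    ]
    size = len(rows)
    for index, row in enumerate(rows):
        if len(row) != index + 1:
            raise ValueError(
                f"row {index} has {len(row)} values, expected {index + 1}"
            )
    full = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value  # mirror across the diagonal
    return full


matrix = expand_lower_triangle("0\n0.12 0\n0.30 0.25 0")
```

The length check rejects input that is not a proper lower triangle, for example a full square matrix or a triangle without its diagonal.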
 +
 +
 +
  
  
Line 276: Line 268:
 | || locusGenic //(--> coding/noncoding)// ||  o  |  o  | |
 | || locusLength ||  o  |  o  | |
 +| || locusAncestralState ||  o  |  o  | |
 | || locusLinks //(--> URL)// ||  o  |  o  | |
 | || locusComments ||  o  |  o  | |
Line 290: Line 283:
 | || indFreq //(absolute Freq)// ||  o  |  o  |  x  |
 | || data *<sup>5</sup> //(locus data, locus data, ...)// ||  x  |  x  | |
-| || read *<sup>6</sup> | start *<sup>6</sup> | |  x   | |
+| || read *<sup>6</sup> (attribute: id) | start *<sup>6</sup> | |  x   | |
 | ||| length *<sup>6</sup> | |  o  | |
 | ||| data *<sup>6</sup> | |  x  | |
pgd.1318340773.txt.gz · Last modified: 2011/10/11 15:46 (external edit)