baps

Differences

This shows you the differences between two versions of the page.

baps [2008/06/16 10:15] heidi
baps [2013/02/20 13:24] (current) heidi
Line 4: Line 4:
  
 \\ \\
-Version 5.1\\+Version 5.4 (29.04.2010)\\
 A program for Bayesian inference of the genetic structure in a population. Assigns individuals to genetic clusters by either considering them as immigrants (mixture analysis) or as descendants from immigrants (admixture analysis). A program for Bayesian inference of the genetic structure in a population. Assigns individuals to genetic clusters by either considering them as immigrants (mixture analysis) or as descendants from immigrants (admixture analysis).
 +
 +
  
  
 ===== Program information ===== ===== Program information =====
-  * Windows XP/2000/Vista +  * Windows XP/Vista/7 (32-bit, 64-bit) 
-  * Mac OS X +  * Mac OS X Snow Leopard (64-bit) 
-  * Linux+  * Linux (32-bit) 
 + 
 + 
  
  
Line 17: Line 22:
 ===== Data type handled ===== ===== Data type handled =====
   * haploid/diploid/(tetraploid)   * haploid/diploid/(tetraploid)
-  * SNP+  * DNA 
 +  * SNP (sequence/numeric)
   * AFLP   * AFLP
   * Microsatellite   * Microsatellite
-  * multi-allelic markers+  * Standard (multi-allelic markers)
  
  
Line 147: Line 153:
  
 \\ \\
 +
 +
  
  
Line 153: Line 161:
 ==== Spatial clustering: ==== ==== Spatial clustering: ====
 Same as the first two above, except for the coordinate values that need to be given in a separate file: Same as the first two above, except for the coordinate values that need to be given in a separate file:
-  * as many rows as there are individuals (spatial clustering of individuals) or groups (spatial clustering of groups) in the molecular data set.+  * as many rows as there are individuals (spatial clustering of individuals -> sampling coordinates of each individual) or groups (spatial clustering of groups -> sampling coordinates of each group) in the molecular data set.
   * missing coordinate: two consecutive zeros   * missing coordinate: two consecutive zeros
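The coordinate-file rules above (one row per individual or group, two consecutive zeros for a missing coordinate) can be sketched in Python. This is only an illustration: the function name `format_coordinates` and the single-space separator are assumptions, not part of BAPS.

```python
def format_coordinates(coords):
    """Return coordinate-file text, one row per individual or group.

    `coords` is a list of (x, y) tuples in the same order as the
    molecular data set. None marks a missing coordinate, which is
    written as two consecutive zeros.
    """
    rows = []
    for pair in coords:
        x, y = (0, 0) if pair is None else pair
        rows.append(f"{x} {y}")
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    # The third entry has no known sampling location.
    print(format_coordinates([(172, 88), (155, 96), None]), end="")
```

The number of rows always equals the length of the input list, so the file stays aligned with the molecular data as long as the list is built in the same order.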
  
 \\ \\
 **example:** **example:**
-  * Data flie: see first two example +  * Data file: see first example 
-  * Coordinate file: (sampling coordinates of each group or each ind -> clustering of groups of ind or clustering of ind)<code>+  * Coordinate file: <code>
 172  88 172  88
 155  96 155  96
Line 168: Line 176:
  
 \\ \\
 +
  
  
 ==== Clustering of linked molecular data (sequence data): ==== ==== Clustering of linked molecular data (sequence data): ====
 === MLST data format: === === MLST data format: ===
 +  * for prokaryotic organisms
   * first column: identifier where the numbering should go linearly from 1 to number of isolates (unique for each)   * first column: identifier where the numbering should go linearly from 1 to number of isolates (unique for each)
   * second column: unique ID label for each isolate (for printing results). The header could either be “Isolate” or “Strain”   * second column: unique ID label for each isolate (for printing results). The header could either be “Isolate” or “Strain”
   * third column (optional): provides a species or similar group name for the isolates   * third column (optional): provides a species or similar group name for the isolates
   * remaining columns: genes for which there are aligned sequences available   * remaining columns: genes for which there are aligned sequences available
 +  * if header is given: columns can be in different order
  
 **example:** <code> **example:** <code>
Line 197: Line 208:
  
 \\ \\
-=== BASP data format: ===+=== BAPS data format: ===
   * haploid marker data (single data row per individual)   * haploid marker data (single data row per individual)
   * diploid marker data (two rows per individual)   * diploid marker data (two rows per individual)
Line 203: Line 214:
  
 \\ \\
-  * numeric data input format or a direct sequence based format: +numeric data input format or a direct sequence based format: 
-    * numeric format: replacing each of A,C,G,T with a unique integer and missing values with a negative integer (-999)Individual Index after the sequence separated by a space +  * **numeric format:**  
-    * sequence formatIndividual Index after the sequence separated by space+    * replacing each of A,C,G,T with a unique integer and missing values with a negative integer (-999) 
 +    * Individual Index after the sequence separated by a space 
 +    * example: a single data row for individual 110 with sequence AACCG-T could look like this: <code>
 +65 65 67 67 71 -999 84 110 
 +</code>
  
-**example** (diploid): <code>+  * **sequence format:**  
 +    * Individual Index after the sequence separated by a space 
 +    * example (diploid): <code>
 ATTTGCCTACGTAGCCAATT 1 ATTTGCCTACGTAGCCAATT 1
 TTACCGACCTTAAAAACCTT 1 TTACCGACCTTAAAAACCTT 1
Line 214: Line 231:
 </code> </code>
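The numeric format can be produced from a sequence string with a few lines of Python. The sketch below assumes the integer codes used in the example row for individual 110 (A=65, C=67, G=71, T=84, which happen to be the ASCII codes of the letters) and treats any other character as missing. The function name `encode_row` is hypothetical.

```python
MISSING = -999  # negative integer used for missing values


def encode_row(sequence, individual_index):
    """Encode one sequence as a numeric BAPS-style data row.

    Each of A, C, G, T is replaced by a unique integer (here its ASCII
    code, an assumption taken from the example row), any other character
    becomes MISSING, and the individual index is appended at the end.
    """
    codes = []
    for base in sequence.upper():
        if base in "ACGT":
            codes.append(ord(base))
        else:
            codes.append(MISSING)
    codes.append(individual_index)
    return " ".join(str(code) for code in codes)


if __name__ == "__main__":
    print(encode_row("AACCG-T", 110))
```

For diploid data, the function would be called twice per individual with the same index, once for each row.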
  
-  * separate file of gene boundaries:+\\ 
 +  In contrast to the MLST format, the BAPS format requires you to concatenate the sequences from all considered genes into a single sequence and to specify the gene boundaries in a separate file. Separate file of gene boundaries:
     * number of rows equals the number of genes     * number of rows equals the number of genes
     * at each row, the integers refer to those columns of the data matrix that correspond to the specific gene     * at each row, the integers refer to those columns of the data matrix that correspond to the specific gene
 +    * Additional zeros are used to fill the rows to have an equal number of columns
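
The layout of the gene-boundary file described above can be generated from the gene lengths alone. The following Python sketch is an illustration under assumptions: the genes are concatenated in order, each row lists the consecutive 1-based column indices of one gene, and the function name `linkage_map` is hypothetical.

```python
def linkage_map(gene_lengths):
    """Return gene-boundary file text for genes concatenated in order.

    Each row lists the 1-based data-matrix columns belonging to one
    gene. Shorter rows are padded with zeros so that every row has the
    same number of columns.
    """
    width = max(gene_lengths)
    rows = []
    start = 1
    for length in gene_lengths:
        columns = list(range(start, start + length))
        columns += [0] * (width - length)  # zero padding
        rows.append(" ".join(str(column) for column in columns))
        start += length
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    # Three genes of 10, 8 and 12 aligned sites.
    print(linkage_map([10, 8, 12]), end="")
```

The number of rows equals the number of genes, and the last non-zero entry of the final row equals the total length of the concatenated sequence.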
  
 **example** (“linkage map”: 3 genes the first corresponding to the columns 1-10 in the data matrix and so on. Additional zeros result in a matrix having an equal number of columns for each row): <code> **example** (“linkage map”: 3 genes the first corresponding to the columns 1-10 in the data matrix and so on. Additional zeros result in a matrix having an equal number of columns for each row): <code>
Line 260: Line 279:
  
 ===== How to cite ===== ===== How to cite =====
 +Tang J, Hanage WP, Fraser C, Corander J. (2009). Identifying currents in the gene pool for bacterial populations using an integrative approach. PLoS Computational Biology, 5(8): e1000455.
 +\\
 Corander, J., Waldmann, P., Marttinen, P. and Sillanpää, M.J. (2004).  BAPS 2: enhanced possibilities for the analysis of genetic population structure, Bioinformatics,  20, 2363-2369. Corander, J., Waldmann, P., Marttinen, P. and Sillanpää, M.J. (2004).  BAPS 2: enhanced possibilities for the analysis of genetic population structure, Bioinformatics,  20, 2363-2369.
  
baps.1213604101.txt.gz · Last modified: 2008/07/22 13:30 (external edit)