BAPS manual


Version 5.4 (29.04.2010)
A program for Bayesian inference of the genetic structure in a population. It assigns individuals to genetic clusters by considering them either as immigrants (mixture analysis) or as descendants of immigrants (admixture analysis).

Program information

Data type handled

Input Files

Clustering of individuals:

BAPS format:


example (cluster 5 diploid individuals. The first individual has alleles 5 and 7 at the first locus and so on. Individuals 1, 2 and 3 were sampled in America and individuals 4 and 5 in Europe):

GENEPOP format:


Clustering of groups of individuals:

BAPS format:


example (data from four distinct groups)

GENEPOP format:


Trained clustering:

Must provide two data files:



example (reference data from two populations, s and r. We wish to cluster three sampling units (1unit: ind1,…). If no relevant information is available for pre-grouping the data to be clustered, every individual should be its own sampling unit in the input data set):


Spatial clustering:

Same input as for the first two analyses above (clustering of individuals and clustering of groups of individuals), with the addition of coordinate values, which must be given in a separate file:


example:


Clustering of linked molecular data (sequence data):

MLST data format:

example:

ST   Isolate     Species            Adk      GyrB     Hsp60    Mdh      Pgi      RecA
1    1A1         My.Splendidone     1        1        1        1        1        1
2    1B1         A.dent             2        2        2        2        2        2
…
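A minimal sketch of reading such a profile table, assuming the columns are separated by whitespace, that no field contains a space, and that the first three columns (ST, Isolate, Species) are metadata while the remaining columns are gene names. The function name `parse_mlst_profiles` and the `n_meta` parameter are hypothetical and are not part of BAPS.

```python
def parse_mlst_profiles(text, n_meta=3):
    """Parse a whitespace-delimited MLST profile table.

    Assumes the first n_meta columns are metadata and the rest are genes
    holding integer allele numbers (an assumption of this sketch).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    genes = header[n_meta:]
    profiles = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}: {ln!r}")
        meta = dict(zip(header[:n_meta], fields[:n_meta]))
        alleles = {gene: int(a) for gene, a in zip(genes, fields[n_meta:])}
        profiles.append({"meta": meta, "alleles": alleles})
    return genes, profiles


# The two rows shown in the example above.
table = """\
ST   Isolate     Species            Adk      GyrB     Hsp60    Mdh      Pgi      RecA
1    1A1         My.Splendidone     1        1        1        1        1        1
2    1B1         A.dent             2        2        2        2        2        2
"""

genes, profiles = parse_mlst_profiles(table)
for p in profiles:
    print(p["meta"]["Isolate"], p["alleles"])
```

Each gene name returned in `genes` identifies one FASTA file that has to be supplied alongside the profile table.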

For each chosen gene a corresponding FASTA file containing the aligned sequences for all included isolates is needed:

example:

>RecA-1
CTAGGGCTTTAACCC--CATTTGCAGTACTGTCATGTCAGTGTACTATTTCAC
>RecA-2
CTAGGGCTTT-ACCCT-CATTTGCAGTACTGCCATGTCACTGTACTAATTCAC
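Because the sequences in each FASTA file must be aligned, every record in a file should have the same length once gaps ("-") are counted. The following is a minimal sketch of such a check, not part of BAPS itself. The function names `read_fasta` and `alignment_length` are hypothetical, and the record labels in the sample text are illustrative only.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def alignment_length(records):
    """Return the common sequence length, or raise if the lengths differ."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences are not aligned, lengths found: {sorted(lengths)}")
    return lengths.pop()


# Illustrative labels; the sequences are the two shown in the example above.
fasta_text = """\
>RecA-1
CTAGGGCTTTAACCC--CATTTGCAGTACTGTCATGTCAGTGTACTATTTCAC
>RecA-2
CTAGGGCTTT-ACCCT-CATTTGCAGTACTGCCATGTCACTGTACTAATTCAC
"""

records = read_fasta(fasta_text)
print(len(records), "records, alignment length", alignment_length(records))
```

Running a check like this on each gene file before loading the data into BAPS can catch unaligned or truncated sequences early.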


BAPS data format:


Numeric data input format or a direct sequence-based format:


example (“linkage map” for 3 genes: the first gene corresponds to columns 1-10 of the data matrix, and so on. Zeros are appended so that every row of the matrix has the same number of columns):

1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 0
20 21 22 23 24 25 26 27 0 0
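The layout of this linkage map can be generated from the number of columns each gene occupies. The sketch below assumes that columns are numbered consecutively from 1 and that genes appear in the same order as in the data matrix. The function names `build_linkage_map` and `format_linkage_map` are hypothetical and are not part of BAPS.

```python
def build_linkage_map(gene_lengths):
    """Return one row per gene listing its column indices, zero-padded to equal width."""
    if not gene_lengths:
        return []
    width = max(gene_lengths)
    rows = []
    next_col = 1  # columns are numbered from 1, as in the example above
    for length in gene_lengths:
        cols = list(range(next_col, next_col + length))
        rows.append(cols + [0] * (width - length))  # pad shorter genes with zeros
        next_col += length
    return rows


def format_linkage_map(rows):
    """Render the rows as space-separated text, one gene per line."""
    return "\n".join(" ".join(str(v) for v in row) for row in rows)


# Three genes occupying 10, 9 and 8 columns, as in the example above.
linkage_rows = build_linkage_map([10, 9, 8])
print(format_linkage_map(linkage_rows))
```

With gene lengths 10, 9 and 8 this reproduces the three rows of the example.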


Admixture of individuals based on mixture clustering

Binary result file of mixture clustering

Admixture based on pre-defined clustering

BAPS format:


example (the first two individuals are assumed to form one cluster whose ID label is 1, individual 3 is not pre-assigned to either cluster, and so on):

GENEPOP format:

How to cite

Tang, J., Hanage, W.P., Fraser, C. and Corander, J. (2009). Identifying currents in the gene pool for bacterial populations using an integrative approach. PLoS Computational Biology, 5(8), e1000455.
Corander, J., Waldmann, P., Marttinen, P. and Sillanpää, M.J. (2004). BAPS 2: enhanced possibilities for the analysis of genetic population structure. Bioinformatics, 20, 2363-2369.