GENEPOP
documentation
GENEPOP input file
Genepop 4.1 (24.03.2011)
It computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci;
It computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc., and of the number of immigrants by Barton & Slatkin's 1986 private allele method;
It performs analyses of isolation by distance from pairwise comparisons of individuals or population samples, including confidence intervals for “neighborhood size”;
Mantel test
Program information
- Windows
- Linux
- Mac OSX
Data type handled
- haploid/diploid
- Microsatellite
- Standard (multi-allelic markers)
Input Files
- First line: anything. Use this line to store information about your data.
- Locus names: They may be given one per line, or on the same line but separated by commas. This could be useful to clearly label each column
Title line: "Grape populations in southern France"
Loc1,Loc2, ADH3,ADH4,ADH5,mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
...
- Pop sample indicator (capitalization does not matter): Each sample from a different geographical origin is declared by a line with a pop statement
- Information for first individual:
ind#001 fem ,0101 0202 0000 0410
- Here ind#001 fem is an identifier for your personal use. You can use any character (except a comma!). The last identifier of every sub-population is used by Genepop as the sample name in output files. The comma between the identifier and the list of genotypes is required.
- 0101 indicates that this individual is homozygous for the 01 allele at the first locus.
- The same individual is homozygous for the 02 allele at the second locus (0202).
- Data are missing at the third locus (0000).
- At the fourth locus, the genotype is 0410, which indicates the presence of alleles 04 and 10.
- More individuals: Each individual's information starts on a new line, but may extend over several lines (do not start a new line in the middle of a one-locus genotype!)
- More samples: each declared by a pop statement on a new line
- Do not leave blank lines at the end of the file
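The layout described above can be read with a few lines of Python. This is a minimal sketch, not Genepop's own parser: the function name `parse_genepop` is hypothetical, and the sketch assumes that a sample separator is a line containing only the word pop (in any capitalization) and that a data line without a comma continues the genotypes of the previous individual.

```python
def parse_genepop(text):
    """Split Genepop-format text into (title, locus names, samples).

    Each sample is a list of (identifier, [genotype tokens]) pairs.
    Assumption: a "pop" separator is a line holding only that word.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]

    # Locus names: one per line, or several on a line separated by commas.
    loci = []
    i = 1
    while i < len(lines) and lines[i].lower() != "pop":
        loci.extend(name.strip() for name in lines[i].split(",") if name.strip())
        i += 1

    samples = []
    for line in lines[i:]:
        if line.lower() == "pop":
            samples.append([])  # a new sample starts here
        elif "," in line:
            # The comma separates the identifier from the genotypes.
            ident, genotypes = line.split(",", 1)
            samples[-1].append((ident.strip(), genotypes.split()))
        else:
            # No comma: assume the line continues the previous individual.
            samples[-1][-1][1].extend(line.split())
    return title, loci, samples


demo = """Title line: "Grape populations in southern France"
Loc1,Loc2, ADH3,ADH4,ADH5,mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
POP
Tertre Roteboeuf , 0102 002002 0201
0405 0807 01
"""
title, loci, samples = parse_genepop(demo)
```

In this demo, `loci` holds the six locus names and `samples` holds two samples of one individual each; the second individual's genotypes are gathered from two lines.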
example:
Title line: "Grape populations in southern France"
ADH Locus 1
ADH #2
ADH three
ADH-4
ADH-5
mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
Grange des Peres , 0202 003001 0102 0303 1111 01
Grange des Peres , 0102 004001 0202 0102 1010 01
Grange des Peres , 0103 002002 0101 0202 1011 01
Grange des Peres , 0203 002004 0101 0102 1010 01
POP
Tertre Roteboeuf , 0102 002002 0201 0405 0807 01
Tertre Roteboeuf , 0102 002001 0201 0405 0307 01
Tertre Roteboeuf , 0201 002003 0101 0505 0402 01
Tertre Roteboeuf , 0201 003003 0301 0303 0603 01
Tertre Roteboeuf , 0101 002001 0301 0505 0807 01
pop
Bonneau 01 , 0101 002002 0304 0805 0304 01
Bonneau 02 , 0201 002002 0404 0505 0304 01
Bonneau 03 , 0101 002100 0304 0505 0101 01
Bonneau 04 , 0101 100100 0204 0805 0304 01
Bonneau 05 , 0101 100002 0104 0808 0304 01
Pop
, 0000 002001 0202 0402 0007 01
, 0200 002001 0202 0205 0707 01
, 0010 002001 0101 0105 0807 01
last pop, 0101 002001 0101 0401 0807 02
This example shows some useful features:
- There is no constraint on the number of blanks separating the various fields
- The individual identifier has a free format
- Alleles are numbered from 01 to 99, or from 001 to 999 if needed. 2-digit and 3-digit coding of alleles can be intermixed (among loci, not within loci!)
- To designate alleles, consecutive numbers are not required
- Haploid and diploid data can be intermixed. 6-digit genotypes are recognized as 3-digit diploid genotypes; 4-digit genotypes are recognized as 2-digit diploid genotypes; 2- and 3-digit genotypes are recognized as haploid genotypes. The same coding should be used consistently within each locus (for haplo-diploid data, haploid data should be coded as diploid data with one unknown allele).
- Genotypes can extend over more than one line
- To group various samples, just remove each relevant Pop separator
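The digit-count rules above can be sketched as a small Python function. The name `decode_genotype` and the use of `None` for a missing allele are illustrative choices, not part of Genepop itself.

```python
def decode_genotype(token):
    """Decode one genotype token into a tuple of allele numbers.

    6 digits -> diploid with 3-digit alleles
    4 digits -> diploid with 2-digit alleles
    2 or 3 digits -> haploid
    An allele coded 00 or 000 is missing and is returned as None.
    """
    if not token.isdigit():
        raise ValueError(f"genotype must contain only digits: {token!r}")
    n = len(token)
    if n in (4, 6):
        half = n // 2
        alleles = (token[:half], token[half:])
    elif n in (2, 3):
        alleles = (token,)
    else:
        raise ValueError(f"unsupported genotype length {n}: {token!r}")
    # int("00") is 0, which `or` turns into None (missing allele).
    return tuple(int(a) or None for a in alleles)


print(decode_genotype("0410"))    # (4, 10)
print(decode_genotype("003001"))  # (3, 1)
print(decode_genotype("0000"))    # (None, None)
print(decode_genotype("01"))      # (1,)
```

Returning a tuple keeps haploid and diploid genotypes distinguishable by length, which matters when both kinds occur in the same file.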
constraints:
- Missing data should be indicated with 00 (or 000 for 3-digit coding) and not with blanks
- The number of locus names should correspond to the number of genotypes in each individual
- No empty line should be present in the data file
- Genepop accepts input file names either with the extension .txt or without any extension
- Input files are ASCII text files
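Several of these constraints can be checked before handing a file to Genepop. The following Python sketch is only an illustration under stated assumptions: the function names `find_problems` and `has_blank_lines` are hypothetical, the individuals are assumed to be already split into (identifier, genotype tokens) pairs, and the within-locus check treats any change in token length at a locus as inconsistent coding.

```python
def find_problems(n_loci, rows):
    """Return a list of messages describing constraint violations.

    n_loci: number of locus names declared in the file.
    rows:   list of (identifier, [genotype tokens]) pairs.
    """
    problems = []
    widths = {}  # locus index -> token length first seen at that locus
    for ident, tokens in rows:
        if len(tokens) != n_loci:
            problems.append(
                f"{ident!r}: {len(tokens)} genotypes, expected {n_loci}")
            continue
        for k, tok in enumerate(tokens):
            if not tok.isdigit() or len(tok) not in (2, 3, 4, 6):
                problems.append(
                    f"{ident!r}: bad genotype {tok!r} at locus {k + 1}")
            elif widths.setdefault(k, len(tok)) != len(tok):
                problems.append(
                    f"{ident!r}: locus {k + 1} mixes {widths[k]}- "
                    f"and {len(tok)}-digit codes")
    return problems


def has_blank_lines(text):
    """True if the text contains an empty or whitespace-only line."""
    return any(not line.strip() for line in text.splitlines())


rows = [
    ("ind1", ["0101", "003003"]),
    ("ind2", ["0101"]),            # one genotype is missing
    ("ind3", ["01", "003003"]),    # haploid code at a diploid locus
]
for message in find_problems(2, rows):
    print(message)
```

A genotype left as blanks instead of 00 or 000 shows up here as a wrong genotype count, since blanks only separate fields.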
How to cite
Rousset F, 2008. GENEPOP'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources, 8:103-106