GENEPOP


Genepop 4.1 (24.03.2011)
It computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci;
It computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc., and of the number of immigrants by Barton & Slatkin's 1986 private allele method;
It performs analyses of isolation by distance from pairwise comparisons of individuals or population samples, including confidence intervals for “neighborhood size”;
It performs Mantel tests.

Program information

Windows
Linux
Mac OSX

Data type handled

haploid/diploid
Microsatellite
Standard (multi-allelic markers)

Input Files

First line: anything. Use this line to store information about your data.
Locus names: They may be given one per line, or on the same line separated by commas. Naming the loci is useful for clearly labelling each genotype column.

Title line: "Grape populations in southern France"
                    Loc1,Loc2, ADH3,ADH4,ADH5,mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
...

Pop sample indicator (capitalization does not matter): each sample from a different geographical origin is declared by a line containing a Pop statement.
Information for first individual:
```
ind#001 fem ,0101 0202 0000 0410
```
- Here ind#001 fem is an identifier for your personal use. You can use any character (except a comma!). The last identifier of every sub-population is used by Genepop as the sample name in output files. The comma between the identifier and the list of genotypes is required.
- 0101 indicates that this individual is homozygous for the 01 allele at the first locus.
- The same individual is homozygous for the 02 allele at the second locus (0202).
- Data are missing at the third locus (0000).
- At the fourth locus, the genotype is 0410, which indicates the presence of alleles 04 and 10.
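The reading of this individual line can be sketched in Python. This is only an illustration, not Genepop's own parser: the function name `parse_individual` is hypothetical, and the sketch assumes 2-digit diploid coding (4-digit genotypes) as in the line above.

```python
def parse_individual(line):
    """Split one individual record into (identifier, genotypes).

    Hypothetical helper; assumes every genotype is a 4-digit token
    (two 2-digit alleles), as in the example line.
    """
    identifier, sep, genotype_part = line.partition(",")
    if not sep:
        raise ValueError("missing comma between identifier and genotypes")
    genotypes = []
    for token in genotype_part.split():
        if len(token) != 4 or not token.isdigit():
            raise ValueError(f"unexpected genotype token: {token!r}")
        first, second = int(token[:2]), int(token[2:])
        # 00 codes a missing allele; it is represented here as None
        genotypes.append((first or None, second or None))
    return identifier.strip(), genotypes


name, genotypes = parse_individual("ind#001 fem ,0101 0202 0000 0410")
print(name)       # ind#001 fem
print(genotypes)  # [(1, 1), (2, 2), (None, None), (4, 10)]
```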
More individuals: Each individual's information starts on a new line, but may extend over several lines (do not start a new line in the middle of a one-locus genotype!)
More samples: Each is declared by a pop statement on a new line
Do not leave blank lines at the end of the file

example:

Title line: "Grape populations in southern France"
ADH Locus 1
ADH #2
ADH three
ADH-4
ADH-5
mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
Grange des Peres , 0202 003001 0102 0303 1111 01
Grange des Peres , 0102 004001 0202 0102 1010 01
Grange des Peres , 0103 002002 0101 0202 1011 01
Grange des Peres , 0203 002004 0101 0102 1010 01
POP
Tertre Roteboeuf , 0102 002002 0201 0405 0807 01
Tertre Roteboeuf , 0102 002001 0201 0405 0307 01
Tertre Roteboeuf , 0201 002003 0101 0505 0402 01
Tertre Roteboeuf , 0201 003003 0301 0303 0603 01
Tertre Roteboeuf , 0101 002001 0301 0505 0807 01
pop
Bonneau 01 , 0101 002002 0304 0805 0304 01
Bonneau 02 , 0201 002002 0404 0505 0304 01
Bonneau 03 , 0101 002100 0304 0505 0101 01
Bonneau 04 , 0101 100100 0204 0805 0304 01
Bonneau 05 , 0101 100002 0104 0808 0304 01
Pop
, 0000 002001 0202 0402 0007 01
, 0200 002001 0202 0205 0707 01
, 0010 002001 0101
0105 0807 01
last pop, 0101 002001 0101 0401 0807 02
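
A file laid out like this example could be read along the following lines. This is a minimal sketch, not the way Genepop itself reads its input: the function name `read_genepop` and the returned structure are assumptions, the sample text is an abbreviated version of the example above, and a line without a comma inside a sample is assumed to continue the previous individual.

```python
def read_genepop(text):
    """Return (title, locus_names, samples) from Genepop-formatted text.

    Hypothetical helper. samples is a list of samples, each a list of
    (identifier, genotype_tokens) tuples.
    """
    lines = text.splitlines()
    title = lines[0]
    loci, samples = [], []
    for line in lines[1:]:
        stripped = line.strip()
        if stripped.lower() == "pop":
            samples.append([])  # a pop line (any capitalization) opens a new sample
        elif not samples:
            # before the first pop line: locus names, one per line or comma-separated
            loci.extend(name.strip() for name in stripped.split(","))
        elif "," in line:
            identifier, _, genotype_part = line.partition(",")
            samples[-1].append((identifier.strip(), genotype_part.split()))
        else:
            # no comma: assumed to continue the previous individual's genotypes
            samples[-1][-1][1].extend(stripped.split())
    return title, loci, samples


sample_text = """\
Title line: "Grape populations in southern France"
ADH Locus 1
ADH #2
mtDNA
Pop
Grange des Peres , 0201 003003 01
Grange des Peres , 0202 003001 01
POP
last pop, 0010 002001
01
"""

title, loci, samples = read_genepop(sample_text)
print(loci)                           # ['ADH Locus 1', 'ADH #2', 'mtDNA']
print(samples[1][0])                  # ('last pop', ['0010', '002001', '01'])
print([s[-1][0] for s in samples])    # ['Grange des Peres', 'last pop']
```

The last line of the usage example collects the last identifier of each sample, which is the one Genepop uses as the sample name in output files.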

This example shows some useful features:

There is no constraint on the number of blanks separating the various fields
The individual identifier has a free format
Alleles are numbered from 01 to 99 or 001 to 999 if needed. 2-digit and 3-digit coding of alleles can be intermixed (among loci, not within loci!)
To designate alleles, consecutive numbers are not required
Haploid and diploid data can be intermixed: 6-digit genotypes are recognized as 3-digit diploid genotypes, 4-digit genotypes as 2-digit diploid genotypes, and 2- and 3-digit genotypes as haploid genotypes. The same coding should be used consistently within each locus (for haplo-diploid data, haploid data should be coded as diploid data with one unknown allele).
Genotypes can extend on more than one line
To group various samples, just remove each relevant Pop separator
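
The digit-count rule for recognizing haploid and diploid genotypes can be written out as a small function. This is an illustrative sketch only; the name `decode_genotype` is hypothetical, and missing alleles (00 or 000) are returned as None.

```python
def decode_genotype(token):
    """Return a tuple of alleles for one genotype token; None marks a missing allele.

    Hypothetical helper: 2 or 3 digits are read as one haploid allele,
    4 or 6 digits as two alleles of 2 or 3 digits each.
    """
    if not token.isdigit():
        raise ValueError(f"genotype must contain digits only: {token!r}")
    length = len(token)
    if length in (2, 3):
        width = length       # haploid: the whole token is one allele
    elif length in (4, 6):
        width = length // 2  # diploid: two alleles of equal width
    else:
        raise ValueError(f"unsupported genotype length: {token!r}")
    alleles = [int(token[i:i + width]) for i in range(0, length, width)]
    return tuple(allele if allele != 0 else None for allele in alleles)


print(decode_genotype("003003"))  # (3, 3)
print(decode_genotype("0410"))    # (4, 10)
print(decode_genotype("01"))      # (1,)
print(decode_genotype("0200"))    # (2, None)
```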

constraints:

Missing data should be indicated with 00 (or 000 for 3-digit coding) and not with blanks
The number of locus names should correspond to the number of genotypes in each individual
No empty line should be present in the data file
Genepop accepts input file names either with the extension .txt or without any extension
Input files are ASCII text files
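
Some of these constraints can be checked before a file is handed to Genepop. The sketch below is an assumption about how such a check might look, not part of Genepop: the function name `check_records` and its input structure (locus names plus a list of (identifier, genotype tokens) pairs) are hypothetical. A blank used for missing data shows up as a genotype count that does not match the number of locus names.

```python
def check_records(loci, individuals):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper. individuals is a list of
    (identifier, genotype_tokens) pairs.
    """
    problems = []
    widths = {}  # locus index -> token length first seen at that locus
    for identifier, tokens in individuals:
        if len(tokens) != len(loci):
            problems.append(
                f"{identifier}: {len(tokens)} genotypes for {len(loci)} loci"
            )
            continue
        for i, token in enumerate(tokens):
            if not token.isdigit():
                problems.append(
                    f"{identifier}: non-numeric genotype {token!r} at {loci[i]}"
                )
            elif widths.setdefault(i, len(token)) != len(token):
                # coding must be consistent within each locus
                problems.append(f"{identifier}: inconsistent coding at {loci[i]}")
    return problems


loci = ["Loc1", "Loc2"]
individuals = [
    ("a", ["0101", "003003"]),
    ("b", ["0102"]),            # one genotype missing (e.g. left blank)
    ("c", ["0101", "0303"]),    # 2-digit coding at a locus coded with 3 digits
]
for problem in check_records(loci, individuals):
    print(problem)
# b: 1 genotypes for 2 loci
# c: inconsistent coding at Loc2
```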

How to cite

Rousset F, 2008. GENEPOP'007: a complete re-implementation of the GENEPOP software for Windows and Linux. Molecular Ecology Resources, 8:103-106

Master's thesis, Heidi Lischer
