GENEPOP

GENEPOP input file


Genepop 4.1 (24.03.2011)
It computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci;
It computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc., and of the number of immigrants by Barton & Slatkin's 1986 private allele method;
It performs analyses of isolation by distance from pairwise comparisons of individuals or population samples, including confidence intervals for “neighborhood size”;
It performs Mantel tests.

Program information

  • Windows
  • Linux
  • Mac OSX

Data type handled

  • haploid/diploid
  • Microsatellite
  • Standard (multi-allelic markers)

Input Files

  • First line: anything. Use this line to store information about your data.
  • Locus names: they may be given one per line, or on the same line separated by commas. Naming the loci is useful for clearly labelling each genotype column.
Title line: "Grape populations in southern France"
                    Loc1,Loc2, ADH3,ADH4,ADH5,mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
...
  • Pop sample indicator (capitalization does not matter): each sample from a different geographical origin is declared by a line containing a Pop statement
  • Information for first individual:
    ind#001 fem ,0101 0202 0000 0410
    • Here ind#001 fem is an identifier for your personal use. You can use any character except a comma. The identifier of the last individual in each sample is used by Genepop as the sample name in output files. The comma between the identifier and the list of genotypes is required.
    • 0101 indicates that this individual is homozygous for the 01 allele at the first locus.
    • The same individual is homozygous for the 02 allele at the second locus (0202).
    • Data are missing at the third locus (0000).
    • At the fourth locus, the genotype is 0410, which indicates the presence of alleles 04 and 10.
  • More individuals: each individual's information starts on a new line, but may extend over several lines (do not start a new line in the middle of a one-locus genotype!)
  • More samples: each declared by a pop statement on a new line
  • Do not leave blank lines at the end of the file
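
The layout described above can be read with a few lines of code. The following is a minimal Python sketch, not part of Genepop itself; the function name parse_genepop and the returned structure are assumptions made for illustration. It treats every line before the first Pop statement as locus names, and any line without a comma inside a sample as the continuation of the previous individual. Unlike the rules above, it silently skips blank lines.

```python
def parse_genepop(text):
    """Split Genepop-format text into (title, loci, samples).

    samples is a list (one entry per Pop statement) of lists of
    (identifier, [genotype strings]) tuples.
    Hypothetical helper for illustration only.
    """
    # Assumption: blank lines are skipped here, although the format forbids them.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title, rest = lines[0], lines[1:]
    loci, samples = [], []
    in_header = True
    for line in rest:
        if line.lower() == "pop":  # capitalization does not matter
            in_header = False
            samples.append([])
        elif in_header:
            # Locus names: one per line, or several separated by commas.
            loci.extend(n.strip() for n in line.split(",") if n.strip())
        elif "," in line:
            # The first comma separates the identifier from the genotypes.
            ident, _, genos = line.partition(",")
            samples[-1].append((ident.strip(), genos.split()))
        else:
            # No comma: the genotypes of the previous individual continue.
            if not samples[-1]:
                raise ValueError("continuation line before any individual")
            samples[-1][-1][1].extend(line.split())
    return title, loci, samples


if __name__ == "__main__":
    demo = (
        "Demo title\n"
        "Loc1, Loc2\n"
        "mtDNA\n"
        "Pop\n"
        "ind#001 fem , 0101 0202\n"
        "01\n"
        "pop\n"
        "last pop, 0410 0000 02\n"
    )
    title, loci, samples = parse_genepop(demo)
    print(title, loci, len(samples))
```

Because only the first comma on a line is used as the separator, the sketch relies on the rule that identifiers never contain a comma.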


example:

Title line: "Grape populations in southern France"
ADH Locus 1
ADH #2
ADH three
ADH-4
ADH-5
mtDNA
Pop
Grange des Peres , 0201 003003 0102 0302 1011 01
Grange des Peres , 0202 003001 0102 0303 1111 01
Grange des Peres , 0102 004001 0202 0102 1010 01
Grange des Peres , 0103 002002 0101 0202 1011 01
Grange des Peres , 0203 002004 0101 0102 1010 01
POP
Tertre Roteboeuf , 0102 002002 0201 0405 0807 01
Tertre Roteboeuf , 0102 002001 0201 0405 0307 01
Tertre Roteboeuf , 0201 002003 0101 0505 0402 01
Tertre Roteboeuf , 0201 003003 0301 0303 0603 01
Tertre Roteboeuf , 0101 002001 0301 0505 0807 01
pop
Bonneau 01 , 0101 002002 0304 0805 0304 01
Bonneau 02 , 0201 002002 0404 0505 0304 01
Bonneau 03 , 0101 002100 0304 0505 0101 01
Bonneau 04 , 0101 100100 0204 0805 0304 01
Bonneau 05 , 0101 100002 0104 0808 0304 01
Pop
, 0000 002001 0202 0402 0007 01
, 0200 002001 0202 0205 0707 01
, 0010 002001 0101
0105 0807 01
last pop, 0101 002001 0101 0401 0807 02

This example shows some useful features:

  • There is no constraint on the number of blanks separating the various fields
  • The individual identifier has a free format
  • Alleles are numbered from 01 to 99, or from 001 to 999 if needed. 2-digit and 3-digit coding of alleles can be intermixed (among loci, not within loci!)
  • To designate alleles, consecutive numbers are not required
  • Haploid and diploid data can be intermixed. 6-digit genotypes are recognized as 3-digit diploid genotypes; 4-digit genotypes are recognized as 2-digit diploid genotypes; 2- and 3-digit genotypes are recognized as haploid genotypes. The same coding should be used consistently within each locus (for haplo-diploid data, haploid data should be coded as diploid data with one unknown allele).
  • Genotypes can extend on more than one line
  • To group various samples, just remove each relevant Pop separator
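
The digit-count rule in the list above can be made concrete with a small decoder. This is an illustrative Python sketch under the stated rules, not Genepop's own code; the name decode_genotype and the use of None for a missing allele are assumptions.

```python
def decode_genotype(code):
    """Turn one genotype string into a tuple of allele numbers.

    2 or 3 digits -> haploid, 4 digits -> diploid with 2-digit alleles,
    6 digits -> diploid with 3-digit alleles. An allele coded 00 or 000
    is missing and is returned as None (an assumption of this sketch).
    """
    if not code.isdigit():
        raise ValueError(f"genotype {code!r} must contain digits only")
    n = len(code)
    if n in (2, 3):
        alleles = (int(code),)
    elif n in (4, 6):
        half = n // 2
        alleles = (int(code[:half]), int(code[half:]))
    else:
        raise ValueError(f"genotype {code!r} has an unsupported length {n}")
    return tuple(a if a != 0 else None for a in alleles)


if __name__ == "__main__":
    for g in ("0410", "003001", "0000", "01", "0200"):
        print(g, decode_genotype(g))
```

For instance, 0410 decodes to alleles 4 and 10, and 0200 decodes to allele 2 plus one unknown allele, which is how haploid individuals are coded in haplo-diploid data.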

constraints:

  • Missing data should be indicated with 00 (or 000 for 3-digit coding) and not with blanks
  • The number of locus names should correspond to the number of genotypes in each individual
  • No empty line should be present in the data file
  • Genepop accepts input file names either with the extension .txt or without any extension
  • Input files are ASCII text files
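
Several of these constraints can be checked mechanically before handing a file to Genepop. The sketch below is a hypothetical pre-check written in Python, not a Genepop feature; the function names and message wording are assumptions. It also assumes that a single newline terminating the last line does not count as a blank line.

```python
def check_text(text):
    """Report non-ASCII content and blank lines in the raw file text."""
    problems = []
    if not text.isascii():
        problems.append("file contains non-ASCII characters")
    lines = text.split("\n")
    # Assumption: one final newline is allowed and is not a blank line.
    if lines and lines[-1] == "":
        lines = lines[:-1]
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {number} is blank")
    return problems


def check_genotypes(loci, individuals):
    """Check genotype counts and per-locus coding consistency.

    individuals is an iterable of (identifier, [genotype strings]).
    """
    problems = []
    widths = {}  # locus name -> number of digits first seen at that locus
    for ident, genos in individuals:
        if len(genos) != len(loci):
            problems.append(
                f"{ident!r}: {len(genos)} genotypes for {len(loci)} loci")
            continue
        for locus, g in zip(loci, genos):
            if not g.isdigit() or len(g) not in (2, 3, 4, 6):
                problems.append(f"{ident!r}: bad genotype {g!r} at {locus}")
            elif widths.setdefault(locus, len(g)) != len(g):
                problems.append(
                    f"{ident!r}: coding at {locus} differs from earlier lines")
    return problems


if __name__ == "__main__":
    print(check_text("Title\nLoc1\n\nPop\nA , 0101\n"))
    print(check_genotypes(["Loc1", "Loc2"],
                          [("A", ["0101", "01"]), ("B", ["001001", "02"])]))
```

Both functions return a list of messages rather than raising an error, so that every problem in a file can be reported in one pass.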

How to cite

Rousset F, 2008. GENEPOP'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources, 8:103-106

