
MIGRATE

Version 3.0 (1. August 2008)
Migrate estimates population parameters, namely effective population sizes and migration rates of n populations, from genetic data. It uses a coalescent-theory approach that takes into account the history of mutations and the uncertainty of the genealogy. Parameter values are estimated by either a maximum likelihood (ML) approach or Bayesian inference (BI).

Program information

MacOS X
Linux
Sun Solaris
Windows

Data type handled

DNA sequence
SNP
Microsatellite
Standard (Electrophoretic marker)

Input Files

Syntax:

< token >: the token is obligatory
[token]: optional
{token}: obligatory for some
< token1|token2 >: choose one of the tokens, depending on the kind of data
<individual1 10-10>: means that this token needs to be exactly 10 characters long
The characters of any word token can normally include special characters, punctuation, and blanks (e.g. “Ind1 02 @” is legal)

Enzyme electrophoretic data or microsatellite data look like this:

<Number of populations> <number of loci> {delimiter between alleles} [project title 0-79]
<Number of individuals> <title for population 0-79>
<Individual 1 10-10> <data>
<Individual 2 10-10> <data>
....
<Number of individuals> <title for population 0-79>
<Individual 1 10-10> <data>
<Individual 2 10-10> <data>
....

the delimiter is needed for microsatellite data
the project title is optional
by default, the individual name has to be exactly 10 characters long
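
The fixed-width name rule can be sketched in Python as follows. The helper name `format_individual` and its parameters are illustrative assumptions and not part of MIGRATE; the sketch only shows how a name could be padded so that the data start right after the 10-character name field.

```python
NAME_WIDTH = 10  # default width of the individual-name field


def format_individual(name, genotypes, width=NAME_WIDTH):
    """Return one data line: the name padded to `width` characters, then the data."""
    if len(name) > width:
        raise ValueError(f"name {name!r} is longer than {width} characters")
    # Pad with blanks so that the data always start directly after the name field.
    return name.ljust(width) + " ".join(genotypes)


# Hypothetical usage with the first three genotypes of individual PB1058.
print(repr(format_individual("PB1058", ["ee", "bb", "ab"])))
```

Raising an error for over-long names, instead of silently truncating them, is a design choice of this sketch: truncation could make two individuals share the same name.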

sequences or SNPs

non-interleaved data:

<Number of populations> <number of loci> [project title 0-79]
<number of sites for locus1> <number of sites for locus 2> ...
<Number of individuals locus1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
<Individual 1 10-10> <data locus 1>
<Individual 2 10-10> <data locus 1>
....
<Individual 1 10-10> <data locus 2>
<Individual 2 10-10> <data locus 2>
....
<Number of individuals locus 1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
<Individual 1 10-10> <data locus 1>
<Individual 2 10-10> <data locus 1>
....
<Individual 1 10-10> <data locus 2>
<Individual 2 10-10> <data locus 2>
....
....

interleaved data:

<Number of populations> <number of loci> [project title 0-79]
<number of sites for locus1> <number of sites for locus 2> ...
<Number of individuals locus 1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
<Individual 1 10-10> <data locus 1 part 1>
<Individual 2 10-10> <data locus 1 part 1>
....
<data ind1 locus 1 part 2>
<data ind2 locus 1 part 2>
....
<Individual 1 10-10> <data locus 2 part 1>
<Individual 2 10-10> <data locus 2 part 1>
....
<data ind1 locus 2 part 2>
<data ind2 locus 2 part 2>
....

Enzyme electrophoretic data (infinite allele model)

Genotypes
missing data: “?”
multi-character allele coding can be used when a delimiter is given (see the examples for microsatellites)

example (2 populations, 11 loci, and 3 or 2 individuals per population):

 2 11 Migration rates between two Turkish frog populations
3 Akcapinar (between Marmaris and Adana)
PB1058    ee bb ab bb bb aa aa bb ?? cc aa
PB1059    ee bb ab bb bb aa aa bb bb cc aa
PB1060    ee bb b? bb ab aa aa bb bb cc aa
2 Ezine (between Selcuk and Dardanelles)
PB16843   ee bb ab bb aa aa aa cc bb cc aa
PB16844   ee bb bb bb ab aa aa cc bb cc aa

Microsatellite data

The third argument on the first line has to be a delimiter character (e.g: “.”)
Genotypes
Each individual has two alleles
Alleles are coded as REPEAT NUMBERS
a homozygous individual needs to be coded as, e.g., 6.6 (“.” is the delimiter)
missing data: “?”

example:

 2 3 . Rana lessonae: Seeruecken versus Tal
2   Riedtli near G\"undelhart-H\"orhausen
0         42.45 37.31 18.18
0         42.45 37.33 18.16
4   Tal near Steckborn
1         43.46 33.37 18.18
1         44.46 33.35 19.18
1         44.46 35.? 18.18
1         43.42 35.31 20.18
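
A minimal Python sketch of how one individual line of genotype data could be split into a name and allele pairs. The function name `parse_genotype_line` and its parameters are assumptions made for illustration and are not MIGRATE code. Without a delimiter, the sketch assumes single-character alleles, as in the frog example above.

```python
def parse_genotype_line(line, delimiter=None, name_width=10):
    """Split one individual line into (name, [(allele1, allele2), ...])."""
    name = line[:name_width].strip()
    genotypes = []
    for token in line[name_width:].split():
        if delimiter is not None:
            # Multi-character alleles, e.g. "42.45" -> ("42", "45")
            parts = token.split(delimiter)
        else:
            # Single-character alleles, e.g. "ab" -> ("a", "b")
            parts = list(token)
        if len(parts) != 2:
            raise ValueError(f"expected two alleles in {token!r}")
        genotypes.append((parts[0], parts[1]))
    return name, genotypes


def is_missing(allele):
    """Missing alleles are coded as a question mark."""
    return allele == "?"


# Hypothetical usage with a microsatellite line that has one missing allele.
line = "1".ljust(10) + "44.46 35.? 18.18"
name, genotypes = parse_genotype_line(line, delimiter=".")
print(name, genotypes)
```

Alleles are kept as strings so that the missing-data symbol can be stored next to the repeat numbers; converting the repeat numbers to integers is left to the caller.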

Sequence data

The base sequence of each individual follows the individual name
each character is one of the letters A, B, C, D, G, H, K, M, N, O, R, S, T, U, V, W, X, Y, ?, or -
blanks are ignored (this allows GenBank and EMBL sequence entries to be read with minimum editing)
characters can be either upper or lower case
the characters constitute the IUPAC (IUB) nucleic acid code plus some slight extensions:

Symbol	Meaning
A	Adenine
G	Guanine
C	Cytosine
T	Thymine
U	Uracil
Y	pYrimidine (C or T)
R	puRine (A or G)
W	“Weak” (A or T)
S	“Strong” (C or G)
K	“Keto” (T or G)
M	“aMino” (C or A)
B	not A (C or G or T)
D	not C (A or G or T)
H	not G (A or C or T)
V	not T (A or C or G)
X,N,?	unknown (A or C or G or T)
O	deletion
-	deletion
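
The symbol table above can be written as a small Python lookup, for example to check a sequence before it goes into an input file. This is a sketch only: the names `IUPAC`, `clean_sequence` and `possible_bases` are assumptions, and keeping U as its own base (rather than treating it as T) is a simplification of this example.

```python
# Symbol -> bases it may stand for, following the table above.
IUPAC = {
    "A": "A", "G": "G", "C": "C", "T": "T",
    "U": "U",  # assumption of this sketch: uracil is kept as its own symbol
    "Y": "CT", "R": "AG", "W": "AT", "S": "CG", "K": "TG", "M": "CA",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "X": "ACGT", "N": "ACGT", "?": "ACGT",
    "O": "", "-": "",  # deletions stand for no base
}


def clean_sequence(raw):
    """Drop blanks, convert to upper case, and reject symbols outside the table."""
    seq = "".join(raw.split()).upper()
    bad = sorted(set(seq) - set(IUPAC))
    if bad:
        raise ValueError(f"illegal sequence symbols: {bad}")
    return seq


def possible_bases(symbol):
    """Return the set of bases a symbol may stand for (empty for a deletion)."""
    return set(IUPAC[symbol.upper()])


print(clean_sequence("acg t-"), possible_bases("y"))
```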

examples:

not interleaved (2 populations with 2 loci):

   2 2 Make believe data set using simulated data (2 loci)
50 46
3 3   pop1
eis       ACACCCAACACGGCCCGCGGACAGGGGCTCGAGGGATCACTGACTGGCAC
zwo       ACACAAAACACGGCCCGCGGACAGGGGCTCGAGGGGTCACTGAGTGGCAC
drue      ATACCCAGCACGGCCGGCGGACAGGGGCTCGAGGGAGCACTGAGTGGAAC
eis       ACGCGGCGCGCGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
zwo       ACGCGGCGCGAGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
drue      ACGCGGCGCGAGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
2   pop2
vier      CAGCGCGCGTATCGCCCCATGTGGTTCGGCCAAAGAATGGTAGAGCGGAG
fuef      CAGCGCGAGTCTCGCCCCATGGGGTTAGGCCAAATAATGTTAGAGCGGCA
vier      TCGACTAGATCTGCAGCACATACGAGGGTCATGCGTCCCAGATGTG
fuefLoc2  TCGACTAGATATGCAGCAAATACGAGGGGCATGCGTCCCAGATGTG
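
To illustrate the non-interleaved layout, here is a hedged Python sketch of a reader for this format. It is not MIGRATE's own parser, and the function name `parse_noninterleaved` and the returned structure are assumptions. It also assumes that a population line with a single number, like the pop2 line above, means the same number of individuals for every locus, and that a population title does not start with a number.

```python
def parse_noninterleaved(text, name_width=10):
    """Return (title, sites, populations) for non-interleaved sequence data."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    head = lines[0].split(None, 2)
    n_pops, n_loci = int(head[0]), int(head[1])
    title = head[2] if len(head) > 2 else ""
    sites = [int(tok) for tok in lines[1].split()]
    if len(sites) != n_loci:
        raise ValueError("need one site count per locus")
    pos, populations = 2, []
    for _ in range(n_pops):
        tokens = lines[pos].split()
        pos += 1
        # Leading integers are the per-locus individual counts.
        counts = []
        while tokens and len(counts) < n_loci and tokens[0].isdigit():
            counts.append(int(tokens.pop(0)))
        if len(counts) == 1:
            counts = counts * n_loci  # assumption: one count applies to all loci
        if len(counts) != n_loci:
            raise ValueError("bad individual counts on population line")
        pop = {"title": " ".join(tokens), "loci": []}
        for locus, n_ind in enumerate(counts):
            records = []
            for line in lines[pos:pos + n_ind]:
                name = line[:name_width].strip()
                seq = "".join(line[name_width:].split())  # blanks are ignored
                if len(seq) != sites[locus]:
                    raise ValueError(f"{name}: expected {sites[locus]} sites")
                records.append((name, seq))
            pos += n_ind
            pop["loci"].append(records)
        populations.append(pop)
    return title, sites, populations
```

Each locus is stored as a list of (name, sequence) pairs rather than a dictionary, so that repeated or blank names do not overwrite each other.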

interleaved (2 populations with 2 loci):

   2 2 Make believe data set using simulated data (2 loci, interleaved)
50 46
3 2   pop1
eis       ACACCCAACACGGCCCGCGGACA
zwo       ACACAAAACACGGCCCGCGGACA
drue      ATACCCAGCACGGCCGGCGGACA
          GGGGCTCGAGGGATCACTGACTGGCAC
          GGGGCTCGAGGGGTCACTGAGTGGCAC
          GGGGCTCGAGGGAGCACTGAGTGGAAC
eis       ACGCGGCGCGCGAACGAAGACCA
zwo       ACGCGGCGCGAGAACGAAGACCA
          AATCTTCTTGATCCCCAAGTGTC
          AATCTTCTTGATCCCCAAGTGTC
2 2 pop2
vier      CAGCGCGCGTATCGCCCCATGTGGTTCGGCCAAAGAATG
fuef      CAGCGCGAGTCTCGCCCCATGGGGTTAGGCCAAATAATG
          GTAGAGCGGAG
  TTAGAGCGGCA
          TCGACTAGATCTG CAGCACATAC
          TCGACTAGATATG CAGCAAATAC
  GAGGGTCATGCGTCCCAGATGTG
  GAGGGGCATGCGTCCCAGATGTG
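
The interleaved layout can be reassembled block by block. The following Python sketch reads one locus of one population; the function name `read_interleaved_locus` and its arguments are assumptions for illustration, not MIGRATE code. Only the first block is assumed to carry the 10-character name field, and blank names, as in the second locus of pop2 above, come back as empty strings.

```python
def read_interleaved_locus(lines, n_ind, n_sites, name_width=10):
    """Reassemble one interleaved locus.

    Returns ([(name, sequence), ...], number_of_lines_used).
    """
    if len(lines) < n_ind:
        raise ValueError("not enough lines for the first block")
    # First block: every line starts with the fixed-width name field.
    names = [ln[:name_width].strip() for ln in lines[:n_ind]]
    parts = ["".join(ln[name_width:].split()) for ln in lines[:n_ind]]
    used = n_ind
    # Later blocks carry data only; read until all sites are covered.
    while len(parts[0]) < n_sites:
        block = lines[used:used + n_ind]
        if len(block) < n_ind:
            raise ValueError("input ended before all sites were read")
        for i, ln in enumerate(block):
            parts[i] += "".join(ln.split())  # blanks are ignored
        used += n_ind
    for name, seq in zip(names, parts):
        if len(seq) != n_sites:
            raise ValueError(f"{name!r}: got {len(seq)} sites, expected {n_sites}")
    return list(zip(names, parts)), used
```

The returned line count lets a caller continue with the next locus or population at the right position in the file.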

How to cite

Please cite the 1999 and the 2001 papers and one of the others:

Beerli, P. 1997-2004. Migrate: documentation and program, part of LAMARC. Version 2.0. Revised December 23, 2004. Distributed over the Internet, http://evolution.gs.washington.edu/lamarc.html [Downloaded: …date….]
Beerli, P., and J. Felsenstein. 2001. PNAS.
Beerli, P., and J. Felsenstein. 1999. Maximum likelihood estimation of migration rates and population numbers of two populations using a coalescent approach. Genetics 152(2): 763-773.
Beerli, P. 1998. Estimation of migration rates and population sizes in geographically structured populations. In: Advances in molecular ecology (Ed. G. Carvalho). NATO-ASI workshop series. IOS Press, Amsterdam. Pp. 39-53.

Master's thesis, Heidi Lischer
