MIGRATE documentation


Version 3.0 (1 August 2008)
Migrate estimates population parameters, namely effective population sizes and migration rates of n populations, from genetic data. It uses a coalescent-theory approach that takes into account the history of mutations and the uncertainty of the genealogy. The parameter values are estimated by either maximum likelihood (ML) or Bayesian inference (BI).

Program information

  • MacOS X
  • Linux
  • Sun Solaris
  • Windows

Data types handled

  • DNA sequence
  • SNP
  • Microsatellite
  • Standard (Electrophoretic marker)

Input Files

Syntax:

  • < token >: the token is obligatory
  • [token]: the token is optional
  • {token}: the token is obligatory for some kinds of data
  • < token1|token2 >: choose one of the tokens
  • <individual1 10-10>: the token must be exactly 10 characters long (similarly, 0-79 means 0 to 79 characters)
  • Word tokens can normally include special characters, punctuation, and blanks (e.g., “Ind1 02 @” is legal)
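
The fixed-width rule for names can be sketched in Python. This is only an illustration and not part of Migrate: the function name `split_individual_line` and the parameter `name_length` are hypothetical. Because names may contain blanks, the line is cut by column position instead of being split on whitespace.

```python
def split_individual_line(line, name_length=10):
    """Split one data line into (name, data) using a fixed-width name field."""
    # Pad short lines so the slice always covers the whole name field.
    padded = line.rstrip("\n").ljust(name_length)
    name = padded[:name_length].strip()   # blanks inside the name are kept
    data = padded[name_length:].strip()
    return name, data


print(split_individual_line("Ind1 02 @ ACGT"))
```

Splitting on whitespace instead would break the legal name “Ind1 02 @” into three pieces, which is why the sketch slices at column 10.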


Enzyme electrophoretic data or microsatellite data look like this:

<Number of populations> <number of loci> {delimiter between alleles} [project title 0-79]
<Number of individuals> <title for population 0-79>
<Individual 1 10-10> <data>
<Individual 2 10-10> <data>
....
<Number of individuals> <title for population 0-79>
<Individual 1 10-10> <data>
<Individual 2 10-10> <data>
....
  • the delimiter is needed for microsatellite data
  • the project title is optional
  • by default, the individual name must be exactly 10 characters long
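
As a rough sketch, the first line of such a file could be read as follows. The function name `parse_header` is hypothetical, and the way the delimiter is told apart from the title (a single non-alphanumeric character as third token) is an assumption made for this example, not a rule taken from Migrate.

```python
def parse_header(line):
    """Parse '<populations> <loci> {delimiter} [title]' into a dict."""
    tokens = line.split()
    n_pop, n_loci = int(tokens[0]), int(tokens[1])
    rest = tokens[2:]
    delimiter = None
    # Assumption: a delimiter is a single non-alphanumeric character.
    if rest and len(rest[0]) == 1 and not rest[0].isalnum():
        delimiter = rest[0]
        rest = rest[1:]
    # Note: runs of blanks inside the title are collapsed to single blanks.
    return {"populations": n_pop, "loci": n_loci,
            "delimiter": delimiter, "title": " ".join(rest)}


print(parse_header("2 3 . Rana lessonae: Seeruecken versus Tal"))
print(parse_header("2 11 Migration rates between two Turkish frog populations"))
```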


Sequences or SNPs

  • non-interleaved data:
    <Number of populations> <number of loci> [project title 0-79]
    <number of sites for locus1> <number of sites for locus 2> ...
    <Number of individuals locus 1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
    <Individual 1 10-10> <data locus 1>
    <Individual 2 10-10> <data locus 1>
    ....
    <Individual 1 10-10> <data locus 2>
    <Individual 2 10-10> <data locus 2>
    ....
    <Number of individuals locus 1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
    <Individual 1 10-10> <data locus 1>
    <Individual 2 10-10> <data locus 1>
    ....
    <Individual 1 10-10> <data locus 2>
    <Individual 2 10-10> <data locus 2>
    ....
  • interleaved data:
    <Number of populations> <number of loci> [project title 0-79]
    <number of sites for locus1> <number of sites for locus 2> ...
    <Number of individuals locus 1> <#ind locus 2> ... <#ind loc n> <title for population 0-79>
    <Individual 1 10-10> <data locus 1 part 1>
    <Individual 2 10-10> <data locus 1 part 1>
    ....
    <data ind1 locus 1 part 2>
    <data ind2 locus 1 part 2>
    ....
    <Individual 1 10-10> <data locus 2 part 1>
    <Individual 2 10-10> <data locus 2 part 1>
    ....
    <data ind1 locus 2 part 2>
    <data ind2 locus 2 part 2>
    ....
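
The non-interleaved layout above can be read with a short loop. The following Python sketch is an illustration under stated assumptions, not Migrate's own reader: the names `parse_noninterleaved` and `row` are hypothetical, the toy data set is invented, and treating a single individual count as valid for all loci is an assumption.

```python
def parse_noninterleaved(text, name_length=10):
    """Parse a non-interleaved sequence file into a list of populations."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    n_pop, n_loci = int(header[0]), int(header[1])
    sites = [int(tok) for tok in lines[1].split()[:n_loci]]
    pos = 2
    populations = []
    for _ in range(n_pop):
        tokens = lines[pos].split()
        pos += 1
        counts = []
        # Leading integers are the per-locus numbers of individuals.
        while tokens and tokens[0].isdigit() and len(counts) < n_loci:
            counts.append(int(tokens.pop(0)))
        if len(counts) == 1:
            counts = counts * n_loci  # assumption: one count covers all loci
        if len(counts) != n_loci:
            raise ValueError("expected one individual count per locus")
        loci = []
        for locus in range(n_loci):
            records = []
            for _ in range(counts[locus]):
                line = lines[pos]
                pos += 1
                name = line[:name_length].strip()
                # Blanks inside the sequence are ignored.
                seq = "".join(line[name_length:].split()).upper()
                if len(seq) != sites[locus]:
                    raise ValueError(
                        f"{name!r}: expected {sites[locus]} sites, got {len(seq)}")
                records.append((name, seq))
            loci.append(records)
        populations.append({"title": " ".join(tokens), "loci": loci})
    return populations


def row(name, seq):
    """Build a data line with a name field padded to 10 characters."""
    return f"{name:<10}{seq}"


# Invented toy data set: 2 populations, 2 loci with 8 and 6 sites.
toy = "\n".join([
    "2 2 Toy data set (hypothetical)",
    "8 6",
    "2 2 pop1",
    row("ind1", "ACGTACGT"), row("ind2", "ACGTACGA"),
    row("ind1", "TTGACA"), row("ind2", "TTGACC"),
    "1 pop2",
    row("ind3", "ACGTTCGT"),
    row("ind3", "TTGGCA"),
])
populations = parse_noninterleaved(toy)
print(populations[0]["loci"][1])
```

Checking each sequence length against the declared number of sites catches the most common formatting mistake (a name field that is not exactly 10 characters wide) early.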

Enzyme electrophoretic data (infinite allele model)

  • Genotypes
  • missing data: “?”
  • multi-character allele codes can be used together with a delimiter (see the examples for microsatellites)

example (2 populations, 11 loci, and 3 or 2 individuals per population):

 2 11 Migration rates between two Turkish frog populations
3 Akcapinar (between Marmaris and Adana)
PB1058    ee bb ab bb bb aa aa bb ?? cc aa
PB1059    ee bb ab bb bb aa aa bb bb cc aa
PB1060    ee bb b? bb ab aa aa bb bb cc aa
2 Ezine (between Selcuk and Dardanelles)
PB16843   ee bb ab bb aa aa aa cc bb cc aa
PB16844   ee bb bb bb ab aa aa cc bb cc aa
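
To show how such genotype lines can be processed, here is a small Python sketch that counts alleles at one locus. It is only an illustration: the function name `allele_counts` is hypothetical, and it assumes single-character alleles without a delimiter, as in the example above.

```python
from collections import Counter


def allele_counts(lines, locus, name_length=10):
    """Count alleles at one locus (0-based index) across individual lines."""
    counts = Counter()
    for line in lines:
        genotypes = line[name_length:].split()
        for allele in genotypes[locus]:   # each character is one allele
            if allele != "?":             # "?" marks a missing allele
                counts[allele] += 1
    return counts


# The three Akcapinar individuals from the example above.
akcapinar = [
    "PB1058".ljust(10) + "ee bb ab bb bb aa aa bb ?? cc aa",
    "PB1059".ljust(10) + "ee bb ab bb bb aa aa bb bb cc aa",
    "PB1060".ljust(10) + "ee bb b? bb ab aa aa bb bb cc aa",
]
print(allele_counts(akcapinar, locus=2))
```

At the third locus the genotypes are ab, ab, and b?, so the missing allele is skipped and only five alleles are counted.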

Microsatellite data

  • The third argument on the first line has to be a delimiter character (e.g., “.”)
  • Genotypes
  • Each individual has two alleles
  • Alleles are coded as REPEAT NUMBERS
  • a homozygous individual must be coded with both alleles, e.g., 6.6 (“.” is the delimiter)
  • missing data: “?”

example:

 2 3 . Rana lessonae: Seeruecken versus Tal
2   Riedtli near Gündelhart-Hörhausen
0         42.45 37.31 18.18
0         42.45 37.33 18.16
4   Tal near Steckborn
1         43.46 33.37 18.18
1         44.46 33.35 19.18
1         44.46 35.? 18.18
1         43.42 35.31 20.18
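
A microsatellite line of this kind can be decoded as sketched below. The function names are hypothetical and the code is an illustration, not Migrate's parser; it assumes every genotype contains exactly one delimiter and represents a missing allele (“?”) as `None`.

```python
def parse_microsat_genotype(token, delimiter="."):
    """Return a pair of repeat numbers; None stands for a missing allele."""
    parts = token.split(delimiter)
    if len(parts) != 2:
        raise ValueError(f"genotype {token!r} does not have two alleles")
    return tuple(None if p == "?" else int(p) for p in parts)


def parse_microsat_line(line, delimiter=".", name_length=10):
    """Split a line into the individual name and a list of genotypes."""
    name = line[:name_length].strip()
    genotypes = [parse_microsat_genotype(tok, delimiter)
                 for tok in line[name_length:].split()]
    return name, genotypes


print(parse_microsat_line("1".ljust(10) + "44.46 35.? 18.18"))
```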

Sequence data

  • The individual name is followed by the base sequence of that individual
  • each character must be one of the letters A, B, C, D, G, H, K, M, N, O, R, S, T, U, V, W, X, Y, ?, or -
  • blanks are ignored (this allows GenBank and EMBL sequence entries to be read with minimal editing)
  • characters can be either upper or lower case
  • the characters constitute the IUPAC (IUB) nucleic acid code plus some slight extensions:
Symbol   Meaning
A        Adenine
G        Guanine
C        Cytosine
T        Thymine
U        Uracil
Y        pYrimidine (C or T)
R        puRine (A or G)
W        “Weak” (A or T)
S        “Strong” (C or G)
K        “Keto” (T or G)
M        “aMino” (C or A)
B        not A (C or G or T)
D        not C (A or G or T)
H        not G (A or C or T)
V        not T (A or C or G)
X, N, ?  unknown (A or C or G or T)
O        deletion
-        deletion
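
The code table above translates directly into a lookup table. The Python sketch below is an illustration only: the names `IUPAC`, `possible_bases`, and `clean_sequence` are hypothetical, and writing both deletion symbols as "-" is a choice made for this example.

```python
# Each symbol maps to the bases it can stand for, following the table above.
IUPAC = {
    "A": "A", "G": "G", "C": "C", "T": "T", "U": "U",
    "Y": "CT", "R": "AG", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "X": "ACGT", "N": "ACGT", "?": "ACGT",
    "O": "-", "-": "-",   # both symbols denote a deletion
}


def possible_bases(symbol):
    """Return the set of bases a symbol can represent (case-insensitive)."""
    return set(IUPAC[symbol.upper()])


def clean_sequence(raw):
    """Remove blanks, convert to upper case, and reject unknown symbols."""
    seq = "".join(raw.split()).upper()
    bad = sorted(set(seq) - set(IUPAC))
    if bad:
        raise ValueError(f"illegal characters in sequence: {bad}")
    return seq


print(clean_sequence("acgt nry-"))
print(sorted(possible_bases("r")))
```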

examples:

  • not interleaved (2 populations with 2 loci):
    2 2 Make believe data set using simulated data (2 loci)
    50 46
    3 3   pop1
    eis       ACACCCAACACGGCCCGCGGACAGGGGCTCGAGGGATCACTGACTGGCAC
    zwo       ACACAAAACACGGCCCGCGGACAGGGGCTCGAGGGGTCACTGAGTGGCAC
    drue      ATACCCAGCACGGCCGGCGGACAGGGGCTCGAGGGAGCACTGAGTGGAAC
    eis       ACGCGGCGCGCGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
    zwo       ACGCGGCGCGAGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
    drue      ACGCGGCGCGAGAACGAAGACCAAATCTTCTTGATCCCCAAGTGTC
    2   pop2
    vier      CAGCGCGCGTATCGCCCCATGTGGTTCGGCCAAAGAATGGTAGAGCGGAG
    fuef      CAGCGCGAGTCTCGCCCCATGGGGTTAGGCCAAATAATGTTAGAGCGGCA
    vier      TCGACTAGATCTGCAGCACATACGAGGGTCATGCGTCCCAGATGTG
    fuefLoc2  TCGACTAGATATGCAGCAAATACGAGGGGCATGCGTCCCAGATGTG
  • interleaved (2 populations with 2 loci):
    2 2 Make believe data set using simulated data (2 loci, interleaved)
    50 46
    3 2   pop1
    eis       ACACCCAACACGGCCCGCGGACA
    zwo       ACACAAAACACGGCCCGCGGACA
    drue      ATACCCAGCACGGCCGGCGGACA
              GGGGCTCGAGGGATCACTGACTGGCAC
              GGGGCTCGAGGGGTCACTGAGTGGCAC
              GGGGCTCGAGGGAGCACTGAGTGGAAC
    eis       ACGCGGCGCGCGAACGAAGACCA
    zwo       ACGCGGCGCGAGAACGAAGACCA
              AATCTTCTTGATCCCCAAGTGTC
              AATCTTCTTGATCCCCAAGTGTC
    2 2 pop2
    vier      CAGCGCGCGTATCGCCCCATGTGGTTCGGCCAAAGAATG
    fuef      CAGCGCGAGTCTCGCCCCATGGGGTTAGGCCAAATAATG
              GTAGAGCGGAG
      TTAGAGCGGCA
              TCGACTAGATCTG CAGCACATAC
              TCGACTAGATATG CAGCAAATAC
      GAGGGTCATGCGTCCCAGATGTG
      GAGGGGCATGCGTCCCAGATGTG
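
In the interleaved layout, the sequence of each individual has to be reassembled from several blocks. The Python sketch below shows one way to do this for a single locus; it is an illustration, not Migrate's reader. The function name `read_interleaved_locus` is hypothetical, and the sketch assumes that only the first block carries the 10-character name field and that continuation lines contain data only, as in the example above.

```python
def read_interleaved_locus(lines, n_ind, n_sites, name_length=10):
    """Reassemble one interleaved locus.

    Returns ([(name, sequence), ...], number_of_lines_consumed).
    """
    names, parts = [], []
    # First block: fixed-width name followed by the first part of the data.
    for line in lines[:n_ind]:
        names.append(line[:name_length].strip())
        parts.append("".join(line[name_length:].split()))
    used = n_ind
    # Continuation blocks: n_ind lines each, no name field, blanks ignored.
    while any(len(p) < n_sites for p in parts):
        block = lines[used:used + n_ind]
        if len(block) < n_ind:
            raise ValueError("input ends before all sites were read")
        for i, line in enumerate(block):
            parts[i] += "".join(line.split())
        used += n_ind
    return list(zip(names, parts)), used


# Invented toy locus: 2 individuals, 8 sites, split into blocks of 5 and 3.
toy_lines = [
    "ind1".ljust(10) + "ACGTA",
    "ind2".ljust(10) + "ACGTT",
    "          CGT",
    "  CGA",
    "ind1".ljust(10) + "TTGACA",   # first line of the next locus
]
records, used = read_interleaved_locus(toy_lines, n_ind=2, n_sites=8)
print(records, used)
```

The result is a list of pairs instead of a dictionary because the name field may be blank or repeated, as in the second population of the example.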

How to cite

Please cite the 1999 and the 2001 papers and one of the others:

  • Beerli, P. 1997-2004. Migrate: documentation and program, part of LAMARC. Version 2.0. Revised December 23, 2004. Distributed over the Internet, http://evolution.gs.washington.edu/lamarc.html [Downloaded: …date….]
  • Beerli, P., and J. Felsenstein. 2001. PNAS.
  • Beerli, P., and J. Felsenstein. 1999. Maximum likelihood estimation of migration rates and population numbers of two populations using a coalescent approach. Genetics 152(2): 763-773.
  • Beerli, P. 1998. Estimation of migration rates and population sizes in geographically structured populations. In: Advances in molecular ecology (Ed. G. Carvalho). NATO-ASI workshop series. IOS Press, Amsterdam. Pp. 39-53.
migrate.1224678494.txt.gz · Last modified: 2008/10/22 14:28 by heidi