im
Differences
This shows you the differences between two versions of the page.
im [2007/12/06 10:05] – heidi | im [2011/07/14 14:26] (current) – heidi | ||
---|---|---|---|
Line 2: | Line 2: | ||
{{im.jpg? | {{im.jpg? | ||
- | **[[http://lifesci.rutgers.edu/ | + | **[[http://genfaculty.rutgers.edu/ |
- | [[http:// | + | [[http:// |
\\ | \\ | ||
- | updated | + | updated |
IM is a program for the fitting of an isolation model with migration to haplotype data drawn from two closely related species or populations. Large numbers of loci can be studied simultaneously, | IM is a program for the fitting of an isolation model with migration to haplotype data drawn from two closely related species or populations. Large numbers of loci can be studied simultaneously, | ||
IMa also allows log likelihood ratio tests of nested demographic models. IMa is faster and better than IM (i.e. by virtue of providing access to the joint posterior density function), and it can be used for most (but not all) of the situations and options that IM can be used for | IMa also allows log likelihood ratio tests of nested demographic models. IMa is faster and better than IM (i.e. by virtue of providing access to the joint posterior density function), and it can be used for most (but not all) of the situations and options that IM can be used for | ||
+ | |||
===== Program information ===== | ===== Program information ===== | ||
- | * written in C | + | * written in C++ |
* unix/linux | * unix/linux | ||
* dos/windows | * dos/windows | ||
* Mac | * Mac | ||
+ | |||
+ | |||
+ | |||
===== Data type handled ===== | ===== Data type handled ===== | ||
+ | * DNA | ||
+ | * Microsatellites (STR) | ||
+ | |||
+ | |||
+ | |||
Line 40: | Line 49: | ||
- The mutation rate per year for the locus (not per base pair). This can be left blank, but is needed for estimating parameters on demographic scales. If there are multiple STRs in the locus then there can be multiple mutation rates on this line separated by spaces. If the locus is a HapSTR, then the first mutation rate given applies to the sequence portion of the locus with subsequent values corresponding to STR markers included in the locus. | - The mutation rate per year for the locus (not per base pair). This can be left blank, but is needed for estimating parameters on demographic scales. If there are multiple STRs in the locus then there can be multiple mutation rates on this line separated by spaces. If the locus is a HapSTR, then the first mutation rate given applies to the sequence portion of the locus with subsequent values corresponding to STR markers included in the locus. | ||
 - If the mutation rate is given, it can be followed by a range of mutation rates that can be used (with ranges for other loci in the analysis) to set priors on the ratios of mutation rate scalars. The range is entered with an open parenthesis, | - If the mutation rate is given, it can be followed by a range of mutation rates that can be used (with ranges for other loci in the analysis) to set priors on the ratios of mutation rate scalars. The range is entered with an open parenthesis, | ||
- | * line 5 - data for gene copy # 1 from population 1. The first 10 spaces are devoted to the sample name. The sequence or allele length (for SSM model) begins in column 11 of the file. The sequence for a given sample is given all on one line without gaps. For SSM or HapSTR data, the allele length assumes a step size of 1. This means that data from STRs that are multiples of lengths greater than 1 must be converted to counts of the number of base repeats (e.g. for a dinucleotide | + | * line 5 - data for gene copy # 1 from population 1. The first 10 spaces are devoted to the sample name. The sequence or allele length (for SSM model) begins in column 11 of the file. The sequence for a given sample is given all on one line without gaps. For SSM or HapSTR data, the allele length assumes a step size of 1. This means that data from STRs that are multiples of lengths greater than 1 must be converted to counts of the number of base repeats (e.g. for a dinucleotide |
+ | * lines 6 through (n1+n2+4) - the remainder of the data for locus 1. Each line contains the data for one sample. The data for population 1 samples are given in lines 5 through (n1+4). The data for population 2 begin on line (n1+5) and proceed to line (n1+n2+4). | ||
+ | * Additional lines for additional loci - If there is more than one locus, then the data for locus 2 begins on line (n1+n2+5) with a line similar to line 4 presenting the basic information for locus 2. The sample names and sample sizes for locus 2 and the inheritance scalars and mutation model for locus 2 need not be the same as for locus 1 | ||
+ | * last line - should end with a newline so that the file ends on a blank line | ||
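The line arithmetic in the list above can be checked with a small helper. This is only a sketch: the function name `locus_line_ranges` and its `header_line` parameter are hypothetical and not part of IM. It assumes the locus information line sits on line 4 for the first locus, as described, and that every sample occupies exactly one line.

```python
def locus_line_ranges(n1, n2, header_line=4):
    """Return 1-based line ranges for one locus block of an IM input file.

    n1, n2      -- sample sizes for population 1 and population 2
    header_line -- line holding the locus information (4 for locus 1)

    Returns ((pop1_first, pop1_last), (pop2_first, pop2_last), next_header).
    """
    if n1 < 0 or n2 < 0:
        raise ValueError("sample sizes must be non-negative")
    # Population 1 data start on the line right after the locus header.
    pop1 = (header_line + 1, header_line + n1)
    # Population 2 data follow immediately after population 1.
    pop2 = (header_line + n1 + 1, header_line + n1 + n2)
    # The header of the next locus (if any) comes right after population 2.
    next_header = header_line + n1 + n2 + 1
    return pop1, pop2, next_header


if __name__ == "__main__":
    # Two samples from population 1 and one from population 2.
    print(locus_line_ranges(2, 1))
```

For additional loci, the returned `next_header` value can be passed back in as `header_line`, which mirrors the rule that locus 2 begins on line (n1+n2+5).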
+ | |||
+ | |||
+ | |||
+ | ==== example: ==== | ||
+ | Example for a tiny three-locus data set. The mutation rate per year is specified for loci 1 and 3, but not for locus 2. | ||
+ | < | ||
+ | Example data for IM | ||
+ | # im test data | ||
+ | population1 population2 | ||
+ | 3 | ||
+ | locus1 1 1 13 I 1 0.0000000008 (0.0000000001, | ||
+ | pop1_1 | ||
+ | pop2_1 | ||
+ | hapstrexample 2 1 4 J2 0.75 | ||
+ | pop1_1 | ||
+ | pop1_2 | ||
+ | pop2_1 | ||
+ | strexample 2 2 1 S1 1 0.00001 (0.000001, 0.00005) | ||
+ | strpop11a 23 | ||
+ | strpop11b 26 | ||
+ | strpop21a 25 | ||
+ | strpop21b 31 | ||
+ | </ | ||
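The locus information lines in the example can be split into fields with a few lines of Python. This is a rough sketch, not IM's own parser: the field order (name, n1, n2, length, mutation model, inheritance scalar, optional mutation rates, optional range in parentheses) is inferred from the example lines, and the names `LocusHeader` and `parse_locus_header` are hypothetical. It handles at most one parenthesized range per line.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LocusHeader(NamedTuple):
    name: str
    n1: int
    n2: int
    length: int
    model: str
    inheritance_scalar: float
    mutation_rates: Tuple[float, ...]
    rate_range: Optional[Tuple[float, float]]


# Matches a range such as "(0.000001, 0.00005)".
_RANGE = re.compile(r"\(\s*([^,\s]+)\s*,\s*([^)\s]+)\s*\)")


def parse_locus_header(line):
    """Parse one locus information line (field order assumed from the example)."""
    rate_range = None
    match = _RANGE.search(line)
    if match:
        rate_range = (float(match.group(1)), float(match.group(2)))
        # Drop the range so the remaining text splits cleanly on whitespace.
        line = line[:match.start()]
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields, got %d" % len(fields))
    name, n1, n2, length, model, scalar = fields[:6]
    # Any remaining fields are mutation rates (possibly several for STR loci).
    rates = tuple(float(value) for value in fields[6:])
    return LocusHeader(name, int(n1), int(n2), int(length), model,
                       float(scalar), rates, rate_range)


if __name__ == "__main__":
    print(parse_locus_header("strexample 2 2 1 S1 1 0.00001 (0.000001, 0.00005)"))
    print(parse_locus_header("hapstrexample 2 1 4 J2 0.75"))
```

A locus without a mutation rate, such as `hapstrexample`, yields an empty `mutation_rates` tuple and `rate_range` set to `None`.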
===== How to cite ===== | ===== How to cite ===== | ||
+ | * Nielsen R. and Wakeley J. (2001). Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics, 158(2): | ||
+ | * Hey J. and Nielsen R. (2007). Integration within the Felsenstein equation for improved Markov chain Monte Carlo methods in population genetics. Proc Natl Acad Sci USA, 104(8): |
im.1196931931.txt.gz · Last modified: 2008/07/22 13:30 (external edit)