phylip
Differences
This shows you the differences between two versions of the page.
phylip [2007/12/06 12:07] – heidi | phylip [2011/07/07 12:49] (current) – heidi | ||
---|---|---|---|
Line 6: | Line 6: | ||
\\ | \\ | ||
- | Version 3.67 (July, 2007) | + | Version 3.69 (September 2009)\\ |
PHYLIP, the Phylogeny Inference Package, is a package of programs for inferring phylogenies (evolutionary trees). It can infer phylogenies by parsimony, compatibility, | PHYLIP, the Phylogeny Inference Package, is a package of programs for inferring phylogenies (evolutionary trees). It can infer phylogenies by parsimony, compatibility, | ||
Line 28: | Line 28: | ||
* discrete characters | * discrete characters | ||
* continuous characters | * continuous characters | ||
+ | |||
===== Input Files ===== | ===== Input Files ===== | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== nucleotide sequences ==== | ||
+ | |||
For most of the PHYLIP programs, information comes from a series of input files, and ends up in a series of output files: | For most of the PHYLIP programs, information comes from a series of input files, and ends up in a series of output files: | ||
< | < | ||
Line 54: | Line 61: | ||
* next lines: information for each species, starting with a ten-character species name, which can include blanks and some punctuation marks. A shorter name must be padded with blanks to the full ten characters. The characters for that species follow, and the name must be on the same line as the first character of the data for that species. | * next lines: information for each species, starting with a ten-character species name, which can include blanks and some punctuation marks. A shorter name must be padded with blanks to the full ten characters. The characters for that species follow, and the name must be on the same line as the first character of the data for that species. | ||
* In the discrete-character programs, DNA sequence programs and protein sequence programs the characters are each a single letter or digit, sometimes separated by blanks. In the continuous-characters programs they are real numbers with decimal points, separated by blanks: < | * In the discrete-character programs, DNA sequence programs and protein sequence programs the characters are each a single letter or digit, sometimes separated by blanks. In the continuous-characters programs they are real numbers with decimal points, separated by blanks: < | ||
- | * The conventions about continuing the data beyond one line per species are different between the molecular sequence programs and the others. The molecular sequence programs can take the data in " | + | * The conventions about continuing the data beyond one line per species are different between the molecular sequence programs and the others. The molecular sequence programs can take the data in " |
- | < | + | |
6 39 | 6 39 | ||
Archaeopt CGATGCTTAC CGCCGATGCT | Archaeopt CGATGCTTAC CGCCGATGCT | ||
Line 71: | Line 77: | ||
AATCACGGCA GCCAATCAC | AATCACGGCA GCCAATCAC | ||
</ | </ | ||
- | * blanks within sequences are allowed to make them easier to read | + | |
+ | | ||
+ | * It is important that the number of sites in each group be the same for all species | ||
+ | * In the **sequential format**, the character data can run on to a new line at any time. Thus it is legal to have: < | ||
+ | 1101 | ||
+ | </ | ||
+ | Archaeopt | ||
+ | 0011001101 | ||
+ | </ | ||
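The sequential layout can be read by taking the first ten characters of a species line as the name and then consuming characters, across as many lines as needed, until the declared number of sites is reached. The following is a minimal Python sketch under that assumption. The function name `parse_sequential_phylip` and the sample data are hypothetical and not part of PHYLIP itself.

```python
def parse_sequential_phylip(text):
    """Parse sequential-format sequence data into {name: sequence}.

    Assumptions (illustrative only): the header line holds the number of
    species and the number of sites, names occupy exactly ten columns,
    and blanks inside the sequence data are ignored.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split()[:2])
    records = {}
    pos = 1
    for _ in range(n_species):
        if pos >= len(lines):
            raise ValueError("file ended before all species were read")
        name = lines[pos][:10].strip()
        chars = "".join(lines[pos][10:].split())
        pos += 1
        # The data may run on to new lines until n_sites characters are read.
        while len(chars) < n_sites:
            if pos >= len(lines):
                raise ValueError(f"too few sites for species {name!r}")
            chars += "".join(lines[pos].split())
            pos += 1
        if len(chars) != n_sites:
            raise ValueError(f"too many sites for species {name!r}")
        records[name] = chars
    return records


# Hypothetical sample: names are padded to ten columns with blanks.
sample_sequential = "\n".join([
    "2 12",
    "Alpha".ljust(10) + "ACGTAC GT",
    "AC GT",
    "Salmo gair" + "ACGTACGTAC",
    "GT",
])
sequential_records = parse_sequential_phylip(sample_sequential)
```

Slicing the name by column rather than splitting on whitespace matters because a name such as `Salmo gair` contains a blank and may be followed directly by sequence characters.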
- | ==== example: | + | \\ |
- | For the parsimony, compatibility and maximum likelihood programs, excluding the distance matrix methods, | + | |
- | < | + | |
+ | === example: === | ||
+ | | ||
| | ||
Archaeopt CGATGCTTAC CGC | Archaeopt CGATGCTTAC CGC | ||
Line 85: | Line 101: | ||
B.subtilisGGCAGCCAAT CAC | B.subtilisGGCAGCCAAT CAC | ||
</ | </ | ||
+ | |||
+ | * example of interleaved format: < | ||
+ | 5 42 | ||
+ | Turkey | ||
+ | Salmo gairAAGCCTTGGC AGTGCAGGGT | ||
+ | H. SapiensACCGGTTGGC CGTTCAGGGT | ||
+ | Chimp | ||
+ | Gorilla | ||
+ | |||
+ | GAGCCCGGGC AATACAGGGT AT | ||
+ | GAGCCGTGGC CGGGCACGGT AT | ||
+ | ACAGGTTGGC CGTTCAGGGT AA | ||
+ | AAACCGAGGC CGGGACACTC AT | ||
+ | AAACCATTGC CGGTACGCTT AA | ||
+ | </ | ||
+ | |||
+ | * in sequential format, the same sequences would be: < | ||
+ | 5 42 | ||
+ | Turkey | ||
+ | GAGCCCGGGC AATACAGGGT AT | ||
+ | Salmo gairAAGCCTTGGC AGTGCAGGGT | ||
+ | GAGCCGTGGC CGGGCACGGT AT | ||
+ | H. SapiensACCGGTTGGC CGTTCAGGGT | ||
+ | ACAGGTTGGC CGTTCAGGGT AA | ||
+ | Chimp | ||
+ | AAACCGAGGC CGGGACACTC AT | ||
+ | Gorilla | ||
+ | AAACCATTGC CGGTACGCTT AA | ||
+ | </ | ||
+ | |||
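The interleaved layout shown above can be read block by block: the first block carries the ten-character names plus the first group of sites, and every later block carries one continuation line per species in the same order. The following is a minimal Python sketch under those assumptions. The function name `parse_interleaved_phylip` and the sample data are hypothetical, and the sketch assumes exactly one line per species in each block.

```python
def parse_interleaved_phylip(text):
    """Parse interleaved-format sequence data into {name: sequence}.

    Assumptions (illustrative only): blank lines only separate blocks,
    every block has exactly one line per species, and names occupy the
    first ten columns of the lines in the first block.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split()[:2])
    body = lines[1:]
    if not body or len(body) % n_species != 0:
        raise ValueError("line count is not a multiple of the species count")

    first_block = body[:n_species]
    names = [ln[:10].strip() for ln in first_block]
    seqs = ["".join(ln[10:].split()) for ln in first_block]

    # Later blocks have no names; line k of a block belongs to species k.
    for start in range(n_species, len(body), n_species):
        for k in range(n_species):
            seqs[k] += "".join(body[start + k].split())

    for name, seq in zip(names, seqs):
        if len(seq) != n_sites:
            raise ValueError(f"species {name!r} has {len(seq)} sites")
    return dict(zip(names, seqs))


# Hypothetical sample with two species and 14 sites in two blocks.
sample_interleaved = "\n".join([
    "2 14",
    "Turkey".ljust(10) + "AAGCT NGGGC",
    "Salmo gair" + "AAGCCTTGGC",
    "",
    "ATTT",
    "GGCC",
])
interleaved_records = parse_interleaved_phylip(sample_interleaved)
```

The final length check mirrors the requirement that every species ends up with the same number of sites as declared in the header.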
+ | ==== Distance Matrix ==== | ||
+ | * first line of the input file contains the number of species | ||
+ | * There follows species data, starting with a species name. | ||
+ | * species name is ten characters long, and must be padded out with blanks if shorter | ||
+ | * For each species there then follows a set of distances to all the other species. The distance matrix may be upper-triangular, lower-triangular, or square. The distances can continue to a new line after any of them. If the matrix is lower-triangular, | ||
+ | |||
+ | === examples: === | ||
+ | |||
+ | * sample input file with a square matrix: < | ||
+ | 5 | ||
+ | Alpha 0.000 1.000 2.000 3.000 3.000 | ||
+ | Beta 1.000 0.000 2.000 3.000 3.000 | ||
+ | Gamma 2.000 2.000 0.000 3.000 3.000 | ||
+ | Delta 3.000 3.000 0.000 0.000 1.000 | ||
+ | Epsilon | ||
+ | </ | ||
+ | |||
+ | * sample lower-triangular input matrix with distances continuing to new lines as needed: < | ||
+ | 14 | ||
+ | Mouse | ||
+ | Bovine | ||
+ | Lemur | ||
+ | Tarsier | ||
+ | Squir Monk 1.5232 | ||
+ | Jpn Macaq | ||
+ | Rhesus Mac 1.9182 | ||
+ | Crab-E.Mac | ||
+ | BarbMacaq | ||
+ | Gibbon | ||
+ | 0.7858 | ||
+ | Orang | ||
+ | 0.7140 | ||
+ | Gorilla | ||
+ | 0.7966 | ||
+ | Chimp | ||
+ | 0.8288 | ||
+ | Human | ||
+ | 0.8542 | ||
+ | </ | ||
+ | |||
+ | |||
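The distance matrix layout above can be read in the same column-based way: ten columns of name, then as many distances as the chosen shape requires, continuing on later lines when needed. The following is a minimal Python sketch. The function name `parse_distance_matrix`, its `shape` argument, and the sample values are hypothetical, and the sketch assumes that in a lower-triangular file species number i (counting from zero) carries exactly i distances and no diagonal entry.

```python
def parse_distance_matrix(text, shape="square"):
    """Parse a distance matrix file into (names, full square matrix).

    Assumptions (illustrative only): shape is "square" or "lower";
    names occupy the first ten columns; a lower-triangular row i holds
    i distances, which are mirrored to fill the upper triangle.
    """
    if shape not in ("square", "lower"):
        raise ValueError("shape must be 'square' or 'lower'")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    names = []
    matrix = [[0.0] * n for _ in range(n)]
    pos = 1
    for i in range(n):
        if pos >= len(lines):
            raise ValueError("file ended before all species were read")
        expected = n if shape == "square" else i
        line = lines[pos]
        pos += 1
        names.append(line[:10].strip())
        values = [float(tok) for tok in line[10:].split()]
        # Distances may continue on following lines, which carry no name.
        while len(values) < expected:
            if pos >= len(lines):
                raise ValueError(f"too few distances for {names[-1]!r}")
            values.extend(float(tok) for tok in lines[pos].split())
            pos += 1
        if len(values) != expected:
            raise ValueError(f"too many distances for {names[-1]!r}")
        for j, dist in enumerate(values):
            matrix[i][j] = dist
            if shape == "lower":
                matrix[j][i] = dist
    return names, matrix


# Hypothetical lower-triangular sample; the last row continues on a new line.
sample_lower = "\n".join([
    "4",
    "Mouse".ljust(10),
    "Bovine".ljust(10) + "1.0",
    "Squir Monk" + "2.0 3.0",
    "Human".ljust(10) + "4.0 5.0",
    "6.0",
])
dist_names, dist_matrix = parse_distance_matrix(sample_lower, shape="lower")
```

Returning a full square matrix regardless of the input shape keeps downstream code independent of how the file was written.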
===== How to cite ===== | ===== How to cite ===== | ||
+ | Felsenstein, | ||
+ | |||
+ | \\ | ||
+ | Or if the editor for whom you are writing insists that the citation must be to a printed publication, | ||
+ | Felsenstein, | ||
+ | |||
+ |
phylip.1196939262.txt.gz · Last modified: 2008/07/22 13:30 (external edit)