mega
                Differences
This shows you the differences between two versions of the page.
| mega [2008/05/19 17:00] – heidi | mega [2011/07/07 11:50] (current) – heidi | ||
|---|---|---|---|
| Line 8: | Line 8: | ||
| \\ | \\ | ||
| - | Version | + | Version | 
| MEGA is an integrated tool for conducting automatic and manual sequence alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing evolutionary hypotheses. | MEGA is an integrated tool for conducting automatic and manual sequence alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing evolutionary hypotheses. | ||
| + | |||
| Line 15: | Line 16: | ||
| ===== Program information ===== | ===== Program information ===== | ||
| - | * Windows | + | * Windows XP, Vista, 7 (with at least 64 MB of RAM, 20 MB of available hard disk space) | 
| * MEGA can also be run on other operating systems for which Windows emulators are available: | * MEGA can also be run on other operating systems for which Windows emulators are available: | ||
| * Macintosh: Windows using VirtualPC | * Macintosh: Windows using VirtualPC | ||
| * Sun Workstation: | * Sun Workstation: | ||
| * Linux: Windows using VMWare | * Linux: Windows using VMWare | ||
| + | |||
| Line 27: | Line 29: | ||
| * RNA | * RNA | ||
| * nucleotide | * nucleotide | ||
| + | * distance | ||
| * (protein sequences) | * (protein sequences) | ||
| + | |||
| ===== Input Files ===== | ===== Input Files ===== | ||
| * ASCII-text files | * ASCII-text files | ||
| * extension: *.MEG | * extension: *.MEG | ||
| + | * Importing Data from Other Formats: | ||
| + | * CLUSTAL | ||
| + | * [[NEXUS]] | ||
| + | * [[PHYLIP]] (Interleaved/ | ||
| + | * GCG | ||
| + | * [[FASTA]] | ||
| + | * PIR | ||
| + | * NBRF | ||
| + | * MSF | ||
| + | * IG | ||
| + | * Internet (NCBI) XML format | ||
| + | |||
| + | |||
| Line 56: | Line 73: | ||
| * written after the Title or the Description statement | * written after the Title or the Description statement | ||
| * contains one or more command statements | * contains one or more command statements | ||
| - | * A command statement contains a command and a valid setting keyword ('' | + | * A command statement contains a command and a valid setting keyword ('' | 
| \\ | \\ | ||
| - | * Comments: | + | * Comments: | 
| - | * keywords: | + | * anywhere in the data file | 
| - | * Rules for Taxa Names: Distance matrices as well as sequence data may come from species, populations, | + | * can span multiple lines | 
| - | * ‘#’ Sign: Every label must be written on a new line, and a '#' | + | * enclosed in square brackets ([ and ]) |
| - | * Characters: Taxa labels | + | * can be nested | 
| + | * keywords: | ||
| + | * written in any combination of lower- and upper-case letters | ||
| + | * Taxa Names: | ||
| + | * ‘#’ Sign: Every label must be written on a new line, and a '#' | ||
| + | * no restrictions on the length of the labels | ||
| + | * not required to be unique | ||
| + | * must start with alphanumeric characters (0-9, a-z, and A-Z) or a special character: '' | ||
| + | * After the first character, taxa labels may contain the following additional special characters:'' | ||
| + | * For multiple word labels, an underscore can be used to represent a blank space | ||
| + | |||
| + | \\ | ||
| + | |||
| + | |||
| ==== Sequence Input Data ==== | ==== Sequence Input Data ==== | ||
| Line 86: | Line 117: | ||
| | MatchChar | single character | Synonymous with the identical keyword | MatchChar = . | | | MatchChar | single character | Synonymous with the identical keyword | MatchChar = . | | ||
| | Missing | single character | use question mark (?) to indicate missing data | Missing = ? | | | Missing | single character | use question mark (?) to indicate missing data | Missing = ? | | ||
| + | | CodeTable | A name | This instruction gives the name of the code table for the protein coding domains of the data | CodeTable = Standard | | ||
| + | |||
| * **Defining Genes and Domains:** | * **Defining Genes and Domains:** | ||
| Line 96: | Line 129: | ||
| | CodonStart | A number | specifies the site where the next 1st-codon position will be found in a protein-coding domain | CodonStart=2 | | | CodonStart | A number | specifies the site where the next 1st-codon position will be found in a protein-coding domain | CodonStart=2 | | ||
| - |  | + |  | 
| + | * taxa can be assigned to groups in sequence data files as well as in distance data files. | ||
| + | * the group name is written in curly brackets ({}) after the taxa name. It can be attached to the taxa name with an underscore or appended directly. | ||
| + | * there should be no spaces between the taxa name and group name | ||
| + | |||
| + | * **Labelling Individual Sites:** | ||
| + | * The individual sites in nucleotide or amino acid data can be labeled to construct non-contiguous sets of sites. | ||
| + | * Each site can be associated with only one label | ||
| + | * A label can be a letter or a number. | ||
| + | |||
| + | === example === | ||
| + | < | ||
| !Gene=FirstGene Domain=Exon1 Property=Coding; | !Gene=FirstGene Domain=Exon1 Property=Coding; | ||
| # | # | ||
| Line 111: | Line 155: | ||
| #Mouse ATCTGATCTCGTGTGCTGGTACGAATGATTTCTGCGTTCAACTGA | #Mouse ATCTGATCTCGTGTGCTGGTACGAATGATTTCTGCGTTCAACTGA | ||
| #Chicken ATCTGCTCTCGAGTACTGCTACCAATGACTTCTGCGTACAACTGA | #Chicken ATCTGCTCTCGAGTACTGCTACCAATGACTTCTGCGTACAACTGA | ||
| + | !Label +++__-+++-a-+++-L-+++-k-+++123+++-_-+++---+++; | ||
| </ | </ | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | ==== Distance Input Data ==== | ||
| + | * distances are given as a lower-left or an upper-right triangular matrix | ||
| + | * After writing the # | ||
| + | * Taxa names are followed by the distance matrix | ||
| + | |||
| + | \\ | ||
| + | * **Keywords for Format Statement:** | ||
| + | |||
| + | ^ Command ^ Setting ^ Remark ^ Example ^ | ||
| + | | DataType | Distance | Specifies that the file contains distance data | DataType=distance | ||
| + | | NSeqs | integer | Number of sequences | NSeqs=85 | ||
| + | | NTaxa | integer | Same as NSeqs | NTaxa=85 | ||
| + | | DataFormat | LowerLeft, UpperRight | Specifies whether the data are given as a lower-left or an upper-right triangular matrix | DataFormat=lowerleft | ||
| + | |||
| + | |||
| + | * **Defining Groups:** | ||
| + | * see above | ||
| === example === | === example === | ||
| + | < | ||
| + | #mega | ||
| + | !Title: Concatenated Files; | ||
| + | !Format DataType=Distance DataFormat=LowerLeft NTaxa=6; | ||
| + | |||
| + | #Rodent | ||
| + | #Primate | ||
| + | #Lagomorpha | ||
| + | # | ||
| + | #Carnivora | ||
| + | # | ||
| + |  | ||
| + | 0.514 | ||
| + | 0.535 0.436 | ||
| + | 0.530 0.388 0.418 | ||
| + | 0.521 0.353 0.417 0.345 | ||
| + | 0.500 0.331 0.402 0.327 0.349 | ||
| + | </ | ||
| + | |||
| + | |||
| ===== How to cite ===== | ===== How to cite ===== | ||
| - | * When referring to MEGA in the main text of your publication, | + | Citation for MEGA 5: | 
| - | Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 4 (Tamura, Dudley, Nei, and Kumar 2007). | + | * Tamura K, Peterson D, Peterson N, Stecher G, Nei M, and Kumar S (2011) MEGA5: Molecular Evolutionary Genetics Analysis | 
| - | * When including a MEGA citation in the Literature Cited/ | + | |
| - | Tamura K, Dudley J, Nei M & Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis | + | |
| + | \\ | ||
| + | Citation for MEGA 4: | ||
| + | * Tamura K, Dudley J, Nei M & Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24: 1596-1599. | ||
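
Returning to the distance input format described above: the following Python sketch shows one way such a file could be read. It is an illustration only, not part of MEGA. The function names (`strip_comments`, `split_group`, `parse_lower_left`) and the toy data in `SAMPLE` are hypothetical. The sketch assumes that every command statement starts with `!` and ends with a semicolon, as in the examples on this page, and that only the `LowerLeft` layout is used. Comments in square brackets may be nested and may span lines, so they are removed with a depth counter rather than a regular expression.

```python
import re

# Hypothetical toy file; the bracketed row numbers are ordinary comments.
SAMPLE = """#mega
!Title: Toy distances [a comment [with a nested one]];
!Format DataType=Distance DataFormat=LowerLeft NTaxa=3;

#A_{G1}
#B{G1}
#C

[1]
[2] 0.5
[3] 0.25 0.75
"""


def strip_comments(text):
    """Drop [ ... ] comments; they may be nested and span several lines."""
    kept = []
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


def split_group(label):
    """Split 'Name_{Group}' or 'Name{Group}' into (name, group)."""
    match = re.match(r"^(.*?)_?\{([^{}]*)\}$", label)
    if match:
        return match.group(1), match.group(2)
    return label, None


def parse_lower_left(text):
    """Return (taxa, dist): taxa is a list of (name, group) pairs and
    dist is a full symmetric matrix built from the lower-left triangle."""
    text = strip_comments(text)
    # Assumption: command statements run from '!' to the next ';'.
    text = re.sub(r"![^;]*;", "", text)

    taxa, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.lower() == "#mega":
            continue
        if line.startswith("#"):
            taxa.append(split_group(line[1:].strip()))
        else:
            rows.append([float(value) for value in line.split()])

    n = len(taxa)
    if len(rows) != max(n - 1, 0):
        raise ValueError("expected %d matrix rows, found %d" % (n - 1, len(rows)))

    dist = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError("matrix row %d should hold %d values" % (i, i))
        for j, value in enumerate(row):
            dist[i][j] = dist[j][i] = value
    return taxa, dist


if __name__ == "__main__":
    taxa, dist = parse_lower_left(SAMPLE)
    print(taxa)
    for row in dist:
        print(row)
```

Mirroring the triangle into a full matrix keeps lookups simple (`dist[i][j]` works in either order), at the cost of storing each distance twice. An `UpperRight` file would need the row-length check and index order reversed.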
mega.1211209239.txt.gz · Last modified: 2008/07/22 13:30 (external edit)
                
                