fasta

Differences

This shows you the differences between two versions of the page.

fasta [2007/12/17 15:49] heidifasta [2008/07/22 13:31] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== FASTA ======
 +**[[http://en.wikipedia.org/wiki/FASTA_format|wikipedia: FASTA format]]**\\
 +[[http://www.ncbi.nlm.nih.gov/blast/fasta.shtml|NCBI's FASTA format description]]\\
 +
 +\\
 FASTA format is a text-based format for representing either nucleic acid sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows sequence names and comments to precede the sequences.
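
The description above can be illustrated with a minimal reader sketch. It assumes the common convention that a header line starts with ''>'' and that the sequence follows on one or more lines; lines starting with '';'' are treated as old-style comments and skipped, which is an assumption not stated on this page. The function name ''read_fasta'' and the sample records are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of text lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif line.startswith(";"):
            continue  # assumed old-style comment line
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical sample input, one nucleotide and one peptide record.
sample = [">seq1 first example", "ACGT", "TTGA", ">seq2", "MKV"]
records = list(read_fasta(sample))
```

Because the function accepts any iterable of lines, an open text file can be passed in directly. For real work an established library is the safer choice; this sketch does no validation of the sequence alphabet.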
 +
  
 ===== format information =====
   * text based
 +  * there is no standard file extension for a text file containing FASTA-formatted sequences; commonly used extensions include .fa, .mpfa, .fna, .fsa, .fas and .fasta
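
Since the extension is not standardized, a tool may want to check both the file name and the content. The following is a rough sketch under that assumption; the helper names are hypothetical, and the content check relies only on the convention that the first non-blank line of a FASTA file starts with ''>''.

```python
import os

# Extensions listed above; none of them is an official standard.
FASTA_EXTENSIONS = {".fa", ".mpfa", ".fna", ".fsa", ".fas", ".fasta"}


def has_fasta_extension(filename):
    """Return True if the file name ends in one of the common extensions."""
    return os.path.splitext(filename)[1].lower() in FASTA_EXTENSIONS


def looks_like_fasta(text):
    """Heuristic: return True if the first non-blank line starts with '>'."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith(">")
    return False
```

The content check is only a heuristic: it does not verify the sequence lines, and it rejects files that begin with a comment line.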
  
 ===== Data type handled =====
   * nucleic acid sequences
   * peptide sequences
 +
 +
 +
 +
  
  
Line 26: Line 36:
 </code> </code>
  
-\\ 
 example of a multiple sequence FASTA file:
 <code> <code>
Line 50: Line 59:
     * Many different sequence databases use standardized headers, which helps when automatically extracting information from the header
     * NCBI defined a standard for the unique identifier
-    * NCBI does not give a definitive description of the FASTA defline format; the following is an attempt to summarize the identifier layouts in use: <code> +    * NCBI does not give a definitive description of the FASTA defline format; the following is an attempt to summarize the identifier layouts in use: 
-GenBank                           gi|gi-number|gb|accession|locus +
-EMBL Data Library                 gi|gi-number|emb|accession|locus +
-DDBJ, DNA Database of Japan       gi|gi-number|dbj|accession|locus +
-NBRF PIR                          pir||entry +
-Protein Research Foundation       prf||name +
-SWISS-PROT                        sp|accession|name +
-Brookhaven Protein Data Bank (1)  pdb|entry|chain +
-Brookhaven Protein Data Bank (2)  entry:chain|PDBID|CHAIN|SEQUENCE +
-Patents                           pat|country|number  +
-GenInfo Backbone Id               bbs|number  +
-General database identifier       gnl|database|identifier +
-NCBI Reference Sequence           ref|accession|locus +
-Local Sequence identifier         lcl|identifier +
-</code>+
  
 +| GenBank                          | ''gi|gi-number|gb|accession|locus'' |
 +| EMBL Data Library                | ''gi|gi-number|emb|accession|locus'' |
 +| DDBJ, DNA Database of Japan      | ''gi|gi-number|dbj|accession|locus'' |
 +| NBRF PIR                         | ''pir||entry'' |
 +| Protein Research Foundation      | ''prf||name'' |
 +| SWISS-PROT                       | ''sp|accession|name'' |
 +| Brookhaven Protein Data Bank (1) | ''pdb|entry|chain'' |
 +| Brookhaven Protein Data Bank (2) | ''entry:chain|PDBID|CHAIN|SEQUENCE'' |
 +| Patents                          | ''pat|country|number'' |
 +| GenInfo Backbone Id              | ''bbs|number'' |
 +| General database identifier      | ''gnl|database|identifier'' |
 +| NCBI Reference Sequence          | ''ref|accession|locus'' |
 +| Local Sequence identifier        | ''lcl|identifier'' |
 +//Note: the gi number is a sequence of digits that identifies an NCBI database entry.//
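
The identifier layouts in the table above are bar-separated, so they can be split on ''|'' and mapped to named fields. The sketch below does this for the tagged layouts only; the second PDB layout (''entry:chain|...'') has no leading tag and is deliberately not handled. The function name, the dictionary, and the sample identifiers are hypothetical, and the input is assumed to be the identifier alone, without a trailing description.

```python
# Field names per database tag, taken from the table above.
# None marks a position that the table leaves empty (e.g. pir||entry).
DEFLINE_FIELDS = {
    "gb": ("accession", "locus"),
    "emb": ("accession", "locus"),
    "dbj": ("accession", "locus"),
    "pir": (None, "entry"),
    "prf": (None, "name"),
    "sp": ("accession", "name"),
    "pdb": ("entry", "chain"),
    "pat": ("country", "number"),
    "bbs": ("number",),
    "gnl": ("database", "identifier"),
    "ref": ("accession", "locus"),
    "lcl": ("identifier",),
}


def parse_defline_id(identifier):
    """Split a bar-separated identifier into a dict of named fields."""
    tokens = identifier.lstrip(">").split("|")
    result = {}
    pos = 0
    # Optional leading "gi|gi-number" pair.
    if len(tokens) >= 2 and tokens[0] == "gi":
        result["gi"] = tokens[1]
        pos = 2
    if pos >= len(tokens):
        return result
    tag = tokens[pos]
    names = DEFLINE_FIELDS.get(tag)
    if names is None:
        raise ValueError(f"unknown database tag: {tag!r}")
    result["db"] = tag
    # Pair each field name with its value; skip positions left empty.
    for name, value in zip(names, tokens[pos + 1:]):
        if name is not None:
            result[name] = value
    return result


# Made-up identifiers, for illustration only.
genbank = parse_defline_id("gi|12345|gb|ABC123|LOCUS1")
pir = parse_defline_id("pir||S0001")
```

An unknown tag raises ''ValueError'' rather than guessing, since the table is described as an attempt at a summary and real headers may deviate from it.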
  
 +\\
 === Sequence representation ===
   * After the header line and comments
Line 136: Line 146:
  
  
-===== How to cite =====+ 
 +===== converter ===== 
 +[[http://iubio.bio.indiana.edu/soft/molbio/readseq/|Readseq]] for converting sequence formats to FASTA \\ 
 +[[http://www.bugaco.com/bioinf/|Nexus to Fasta converter]]\\ 
 +[[http://gp2fasta.ovh.org/|GenBank to Fasta converter]]
fasta.1197902954.txt.gz · Last modified: 2008/07/22 13:30 (external edit)