Differences

This shows you the differences between two versions of the page.

--- fasta [2007/12/17 15:49] – heidi
+++ fasta [2008/07/22 13:31] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
 ====== FASTA ======
+**[[http://en.wikipedia.org/wiki/FASTA_format|wikipedia: FASTA format]]**\\
+[[http://www.ncbi.nlm.nih.gov/blast/fasta.shtml|NCBI's FASTA format description]]\\
+\\
 FASTA format is a text-based format for representing either nucleic acid sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows sequence names and comments to precede the sequences.
 ===== format information =====
   * text based
+  * no standard file extension exists for text files containing FASTA-formatted sequences; common extensions include .fa, .mpfa, .fna, .fsa, .fas and .fasta
 ===== Data type handled =====
   * nucleic acid sequences
   * peptide sequences
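
As a rough illustration of the format described above, the following sketch reads FASTA text into (header, sequence) pairs. It assumes the common conventions that a header line starts with ''>'' and that lines starting with '';'' are old-style comments; the function name ''parse_fasta'' and the sample records are invented for this example and are not part of any library.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text lines.

    Assumptions (not taken from a formal specification):
    - a line starting with '>' begins a new record,
    - lines starting with ';' are comments and are skipped,
    - sequence lines of one record are simply concatenated.
    """
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ';' comment lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence data belonging to the current header
    if header is not None:
        yield header, "".join(chunks)  # emit the last record


# Invented sample data: one nucleic acid and one peptide record.
example = """>seq1 first example
ACGTACGT
ACGT
>seq2 second example
MKVLA
"""
records = list(parse_fasta(example.splitlines()))
```

Because the function only needs an iterable of lines, an open file handle can be passed in place of ''example.splitlines()''.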
@@ Line 26: / Line 36: @@
 </code>
-\\
 example of a multiple sequence FASTA file:
 <code>
@@ Line 50: / Line 59: @@
     * Many different sequence databases use standardized headers, which helps when automatically extracting information from the header
     * NCBI defined a standard for the unique identifier
-    * they do not give a definitive description of the FASTA defline format, an attempt to create such a format: <code>
+    * NCBI does not give a definitive description of the FASTA defline format; the following is an attempt to summarize one:
-GenBank                           gi|gi-number|gb|accession|locus
-EMBL Data Library                 gi|gi-number|emb|accession|locus
-DDBJ, DNA Database of Japan       gi|gi-number|dbj|accession|locus
-NBRF PIR                          pir||entry
-Protein Research Foundation       prf||name
-SWISS-PROT                        sp|accession|name
-Brookhaven Protein Data Bank (1)  pdb|entry|chain
-Brookhaven Protein Data Bank (2)  entry:chain|PDBID|CHAIN|SEQUENCE
-Patents                           pat|country|number
-GenInfo Backbone Id               bbs|number
-General database identifier       gnl|database|identifier
-NCBI Reference Sequence           ref|accession|locus
-Local Sequence identifier         lcl|identifier
-</code>
+| GenBank                          | ''gi|gi-number|gb|accession|locus'' |
+| EMBL Data Library                | ''gi|gi-number|emb|accession|locus'' |
+| DDBJ, DNA Database of Japan      | ''gi|gi-number|dbj|accession|locus'' |
+| NBRF PIR                         | ''pir||entry'' |
+| Protein Research Foundation      | ''prf||name'' |
+| SWISS-PROT                       | ''sp|accession|name'' |
+| Brookhaven Protein Data Bank (1) | ''pdb|entry|chain'' |
+| Brookhaven Protein Data Bank (2) | ''entry:chain|PDBID|CHAIN|SEQUENCE'' |
+| Patents                          | ''pat|country|number'' |
+| GenInfo Backbone Id              | ''bbs|number'' |
+| General database identifier      | ''gnl|database|identifier'' |
+| NCBI Reference Sequence          | ''ref|accession|locus'' |
+| Local Sequence identifier        | ''lcl|identifier'' |
+//Note: the gi number is a sequence of digits that identifies a database entry at NCBI.//
+\\
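
The identifier layouts in the table above can be split mechanically on the ''|'' character. The sketch below is one possible way to do that, under the assumption that the identifier is the first whitespace-delimited word of the header; the names ''FIELDS'' and ''parse_defline'' and the sample headers are invented for illustration. The second Brookhaven layout (''entry:chain|PDBID|CHAIN|SEQUENCE'') is not handled.

```python
from typing import Dict, Optional, Tuple

# Field names that follow each database tag, copied from the table above.
# None marks a position that the table shows as empty (e.g. ''pir||entry'').
FIELDS: Dict[str, Tuple[Optional[str], ...]] = {
    "gb": ("accession", "locus"),
    "emb": ("accession", "locus"),
    "dbj": ("accession", "locus"),
    "pir": (None, "entry"),
    "prf": (None, "name"),
    "sp": ("accession", "name"),
    "pdb": ("entry", "chain"),
    "pat": ("country", "number"),
    "bbs": ("number",),
    "gnl": ("database", "identifier"),
    "ref": ("accession", "locus"),
    "lcl": ("identifier",),
}


def parse_defline(header: str) -> Dict[str, str]:
    """Split a FASTA header into named identifier fields plus a description."""
    parts = header.lstrip(">").strip().split(None, 1)
    identifier = parts[0] if parts else ""
    result: Dict[str, str] = {"description": parts[1] if len(parts) > 1 else ""}
    tokens = identifier.split("|")
    i = 0
    while i < len(tokens):
        tag = tokens[i]
        if tag == "gi" and i + 1 < len(tokens):
            result["gi"] = tokens[i + 1]  # gi|gi-number precedes the database tag
            i += 2
        elif tag in FIELDS:
            names = FIELDS[tag]
            values = tokens[i + 1 : i + 1 + len(names)]
            result["db"] = tag
            for name, value in zip(names, values):
                if name and value:  # skip empty positions
                    result[name] = value
            i += 1 + len(names)
        else:
            break  # unknown tag: stop rather than guess
    return result


# Invented header, not a real database entry.
parsed = parse_defline(">gi|12345|gb|AB000001|LOCUS1 made-up example")
```

A header whose first word matches none of the tags is returned with only its ''description'' filled in, so callers can detect unparsed identifiers by the missing ''db'' key.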
 === Sequence representation ===
   * After the header line and comments
@@ Line 136: / Line 146: @@
-===== How to cite =====
+===== converter =====
+[[http://iubio.bio.indiana.edu/soft/molbio/readseq/|Readseq]] for converting sequence formats to FASTA \\
+[[http://www.bugaco.com/bioinf/|Nexus to Fasta converter]]\\
+[[http://gp2fasta.ovh.org/|GenBank to Fasta converter]]