
FASTQ

wikipedia: FASTQ format
http://nar.oxfordjournals.org/content/early/2009/12/16/nar.gkp1137.full


FASTQ is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. Each sequence letter and each quality score is encoded as a single ASCII character for brevity. The format was originally developed at the Wellcome Trust Sanger Institute to bundle a FASTA sequence with its quality data, and it has since become the de facto standard for storing the output of high-throughput sequencing instruments such as the Illumina Genome Analyzer.
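
As a rough illustration of the one-character-per-score encoding, the sketch below converts a quality string into integer scores. It assumes the Sanger convention (Phred scores stored with an ASCII offset of 33), which this page does not specify; older Illumina pipelines used an offset of 64, so the offset is left as a parameter. The function names `decode_quality` and `error_probability` are hypothetical.

```python
def decode_quality(quality_line, offset=33):
    """Convert a FASTQ quality string into a list of integer scores.

    Assumption: scores are stored as chr(score + offset). offset=33 is the
    Sanger convention; older Illumina files used offset=64.
    """
    return [ord(symbol) - offset for symbol in quality_line]


def error_probability(score):
    """Convert a Phred score Q into an error probability, P = 10^(-Q/10)."""
    return 10 ** (-score / 10)


scores = decode_quality("II5!")
print(scores)  # [40, 40, 20, 0]
```

A score of 20 corresponds to an estimated error probability of 1 in 100, and a score of 40 to 1 in 10,000.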

Format information

  • text-based
  • no standard file extension

Format

  • A FASTQ file normally uses four lines per sequence. Line 1 begins with a '@' character, followed by a sequence identifier and an optional description (like a FASTA title line). Line 2 contains the raw sequence letters. Line 3 begins with a '+' character, optionally followed by the same sequence identifier (and any description) again. Line 4 encodes the quality values for the sequence in Line 2 and must contain exactly as many symbols as there are letters in the sequence.
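
The four-line layout described above can be read with a few lines of Python. This is a minimal sketch, not a full parser: it assumes every record occupies exactly four lines (the text says "normally", so files with wrapped sequence or quality lines would need more work), and the function name `read_fastq` and the example record are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (title, sequence, quality) tuples from a four-line FASTQ stream.

    Assumption: each record is exactly four lines, with no line wrapping.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")

        if not header.startswith("@"):
            raise ValueError("line 1 must begin with '@': %r" % header)
        if not separator.startswith("+"):
            raise ValueError("line 3 must begin with '+': %r" % separator)
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ for %r" % header)

        # Drop the leading '@' and keep the identifier plus any description.
        yield header[1:], sequence, quality


# Hypothetical example record, not taken from a real data set.
example = "@read1 example description\nGATTACA\n+\nIIIII5!\n"

for title, sequence, quality in read_fastq(io.StringIO(example)):
    print(title, sequence, quality)
```

The length check mirrors the rule that Line 4 must contain the same number of symbols as Line 2. The same function works on a file opened in text mode, for example `open(path)` with a path of your own.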



fastq.1315227994.txt.gz · Last modified: 2011/09/05 15:06 (external edit)