14.12.2007
Meeting on 14.12.2007
- design of a new data format to store population genetics data:
  - XML or XML-like
  - not too complex, so that the data can still be inspected in text form
  - e.g.:
    - header: name, population, number of loci, history of loci (IDs), (loci information), number of individuals/sequences, interleaved data,…
    - data: in a table-like form with IDs, frequency, data (<ID>…</ID> <Freq>…</Freq> <Genot>…</Genot>)
- also look at different databases/formats:
  - ENSEMBL (in FASTA format)
  - HapMap: HapMap ENCODE data
  - HGDP
  - ENCODE (resequencing HapMap)
  - SOLEXA: information on how Solexa data are handled and concatenated
  - FASTA (the FASTA format also seems to be widely used and would be worth a look)
- R-lequin:
  - provide information on how to structure and set the XML tags in the Arlequin output (for good visualisation and for data extraction to make graphics)
- to do until January
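
To make the header/data idea from the notes more concrete, here is a minimal sketch of what such an XML-like file could look like and how it could be written and read back. Only the tags <ID>, <Freq> and <Genot> come from the notes; the element names PopGenData, Header, Name, Population, NumLoci, NumIndividuals, Data and Record are assumptions made for illustration, and the sample values are placeholders, not real data.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_dataset(name, population, num_loci, records):
    """Build the XML tree; records is a list of (id, frequency, genotype)."""
    root = ET.Element("PopGenData")          # hypothetical root tag
    header = ET.SubElement(root, "Header")   # hypothetical header tags
    ET.SubElement(header, "Name").text = name
    ET.SubElement(header, "Population").text = population
    ET.SubElement(header, "NumLoci").text = str(num_loci)
    ET.SubElement(header, "NumIndividuals").text = str(len(records))
    data = ET.SubElement(root, "Data")
    for rec_id, freq, genot in records:
        rec = ET.SubElement(data, "Record")  # one table-like row per record
        ET.SubElement(rec, "ID").text = rec_id       # tag from the notes
        ET.SubElement(rec, "Freq").text = str(freq)  # tag from the notes
        ET.SubElement(rec, "Genot").text = genot     # tag from the notes
    return root


def to_text(root):
    """Serialise with indentation so the file stays readable as text."""
    return minidom.parseString(ET.tostring(root)).toprettyxml(indent="  ")


def read_records(root):
    """Extract (id, frequency, genotype) tuples from the Data section."""
    return [
        (r.findtext("ID").strip(),
         int(r.findtext("Freq").strip()),
         r.findtext("Genot").strip())
        for r in root.iter("Record")
    ]


# Placeholder values for illustration only.
sample = [("hap1", 12, "ACGT"), ("hap2", 3, "ACGA")]
root = build_dataset("example", "pop1", 4, sample)
text = to_text(root)
print(text)

# Round trip: parse the text again and recover the same records.
parsed = ET.fromstring(text)
recovered = read_records(parsed)
```

Keeping one record per <Record> element stays close to the table-like layout mentioned in the notes, while the indented output keeps the file readable in a plain text editor.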