PED
The “ped” file format is the widely used format for linkage pedigree data and is used as input for the program PLINK. PLINK is a free, open-source whole-genome association analysis toolset designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
Program information
- written in C/C++
- Mac
- Windows
- Unix
Data type handled
- diploid
- AFLP
- MICROSAT
- Standard
Input Files
- whitespace-separated (spaces and/or tabs) text file *.ped
- each line corresponds to one individual
- the first 6 columns are mandatory (the IDs are alphanumeric):
  - Family ID
  - Individual ID
  - Paternal ID
  - Maternal ID
  - Sex (1=male; 2=female; any other character=unknown)
  - Phenotype (only one phenotype is allowed; it can be either a quantitative trait or an affection status, and PLINK detects the type automatically: the column is treated as quantitative if it contains a value other than 0, 1, 2 or the missing phenotype code, -9 by default)
- Comments: a line starting with # is treated as a comment
- Affection status, by default, should be coded:
  - -9 missing
  - 0 missing
  - 1 unaffected
  - 2 affected
- column 7 onwards: Genotypes
  - alleles can be any character (e.g. 1,2,3,4 or A,C,G,T or anything else)
  - missing genotype: 0
  - all markers must be biallelic, and each genotype is given as two alleles, one after the other (diploid)
  - either both alleles of a genotype are missing or neither is
  - haploid data: encode as diploid homozygotes
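The column layout described above can be sketched as a small reader. This is a minimal illustration, not PLINK's own parser: the names `PedRecord`, `parse_ped_line` and `phenotype_kind` are hypothetical, and the phenotype check assumes the default missing code -9.

```python
from typing import List, NamedTuple, Optional, Tuple

MISSING_GENOTYPE = "0"  # default missing-allele code described above


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str
    phenotype: str
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per marker


def parse_ped_line(line: str) -> Optional[PedRecord]:
    """Parse one line of a .ped file; return None for comments and blank lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # splits on any run of spaces and/or tabs
    if len(fields) < 6:
        raise ValueError("expected at least 6 mandatory columns")
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in allele pairs")
    pairs = list(zip(alleles[0::2], alleles[1::2]))
    for a1, a2 in pairs:
        # Either both alleles of a genotype are missing or neither is.
        if (a1 == MISSING_GENOTYPE) != (a2 == MISSING_GENOTYPE):
            raise ValueError(f"half-missing genotype: {a1} {a2}")
    return PedRecord(*fields[:6], pairs)


def phenotype_kind(values: List[str]) -> str:
    """Guess the phenotype type from all values observed in column 6."""
    # Assumption: -9 is the missing code (the default); anything outside
    # this set marks the column as a quantitative trait.
    affection_codes = {"0", "1", "2", "-9"}
    if all(v in affection_codes for v in values):
        return "affection"
    return "quantitative"
```

For example, `parse_ped_line("FAM001 1 0 0 1 2 A A G G 0 0")` yields a record with three genotype pairs, the last one missing. A haploid marker would appear as a homozygous pair such as `("A", "A")`.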
If explicitly specified, the following columns can be omitted:
- Family ID
- Individual ID
- Paternal ID and Maternal ID
- Sex
- Phenotype
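A reader that tolerates omitted leading columns might look like the sketch below. How the omission is signalled to the reading program is not covered here, so the `omitted` parameter, the column keys and the function name `split_leading_columns` are all hypothetical.

```python
from typing import Dict, List, Sequence

# Order of the leading columns in a complete .ped line.
ALL_COLUMNS = [
    "family_id",
    "individual_id",
    "paternal_id",
    "maternal_id",
    "sex",
    "phenotype",
]


def split_leading_columns(
    fields: Sequence[str], omitted: Sequence[str] = ()
) -> Dict[str, object]:
    """Map whitespace-split fields to column names, skipping omitted columns.

    Everything after the leading columns is returned under "alleles".
    """
    present: List[str] = [name for name in ALL_COLUMNS if name not in omitted]
    if len(fields) < len(present):
        raise ValueError("line has fewer fields than expected leading columns")
    record: Dict[str, object] = dict(zip(present, fields))
    record["alleles"] = list(fields[len(present):])
    return record
```

Calling `split_leading_columns("1 0 0 1 2 A A".split(), omitted=["family_id"])` treats the first field as the Individual ID and leaves `A A` as the allele columns.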
AFLP data
- Lumped format:
  - + = band is present
  - - = band is absent
  - 0 = missing data
- data types can be mixed
Example
- Lumped data file:
  NumIndivs 2
  NumLoci 6
  Digits 1
  Format Lumped
  LocusNames sAAT1 sAAT2 sAAT3 ADA1 ADA2 ADH
  1 11 11 11 0 11 32
  2 21 11 21 11 11 12
- NonLumped data file:
  NumIndivs 2
  NumLoci 6
  Digits 1
  Format NonLumped
  LocusNames sAAT1 sAAT2 sAAT3 ADA1 ADA2 ADH
  1 123 143 -1 -1 144 144 120 122 157 158 144 144
  2 135 135 134 140 144 144 120 122 161 161 144 144
- AFLP data file (4 Microsat loci, 5 AFLP loci):
  NumIndivs 2
  NumLoci 9
  Digits 1
  Format Lumped
  LocusNames m1 m2 m3 m4 A1 A2 A3 A4 A5
  1 11 12 13 11 + + + - +
  2 22 33 11 22 - - 0 - -
  3 12 13 13 11 + - - - +
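The Lumped examples above can be read with a short token-based sketch. The layout is inferred only from those examples (keyword/value header pairs up to `LocusNames`, then one name per locus, then one ID plus `NumLoci` genotype tokens per individual), so treat it as an assumption rather than a format specification. The function names `parse_lumped` and `split_lumped` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def parse_lumped(text: str) -> Tuple[Dict[str, str], List[str], Dict[str, List[str]]]:
    """Return (header, locus names, genotypes per individual) for a Lumped file."""
    tokens = text.split()
    header: Dict[str, str] = {}
    pos = 0
    # Header: keyword/value pairs until the LocusNames keyword is reached.
    while pos < len(tokens) and tokens[pos] != "LocusNames":
        header[tokens[pos]] = tokens[pos + 1]
        pos += 2
    if pos >= len(tokens):
        raise ValueError("LocusNames keyword not found")
    num_loci = int(header["NumLoci"])
    pos += 1  # skip the LocusNames keyword itself
    locus_names = tokens[pos:pos + num_loci]
    pos += num_loci
    individuals: Dict[str, List[str]] = {}
    # Each record: individual ID followed by one lumped token per locus.
    while pos < len(tokens):
        record = tokens[pos:pos + 1 + num_loci]
        if len(record) != 1 + num_loci:
            raise ValueError("incomplete individual record")
        individuals[record[0]] = record[1:]
        pos += 1 + num_loci
    return header, locus_names, individuals


def split_lumped(genotype: str, digits: int) -> Optional[Tuple[str, str]]:
    """Split a lumped genotype such as "32" into two alleles; None if missing."""
    if genotype == "0":  # missing data
        return None
    if len(genotype) != 2 * digits:
        raise ValueError(f"expected {2 * digits} characters, got {genotype!r}")
    return genotype[:digits], genotype[digits:]
```

`split_lumped` handles only two-allele tokens and the missing code; AFLP band codes (`+`, `-`) are single symbols and would be passed through unchanged by a caller.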
How to cite
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81: 559-575.
ped.1307459917.txt.gz · Last modified: 2011/06/07 17:18 by heidi